python - Placing every value in its percentile in Pandas

- September 15, 2010

Consider a series with the following percentiles:

    > df['col_1'].describe(percentiles=np.linspace(0, 1, 20))
    count      13859.000000
    mean         421.772842
    std        14665.298998
    min            1.201755
    0%             1.201755
    5.3%           1.430695
    10.5%          1.438417
    15.8%          1.466462
    21.1%          1.473050
    26.3%          1.500834
    31.6%          1.512218
    36.8%          1.542935
    42.1%          1.579845
    47.4%          1.647162
    50%            1.690612
    52.6%          1.749047
    57.9%          1.955589
    63.2%          2.344475
    68.4%          3.075641
    73.7%          4.466094
    78.9%          8.410964
    84.2%         14.998738
    89.5%         41.363612
    94.7%        162.865079
    100%     1511013.790233
    max      1511013.790233
    Name: col_1, dtype: float64

I want a column col_2 that holds, for each row, the percentile it was assigned to in the calculation above.

How can I do this in pandas?

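You can use pd.qcut, which splits a column into quantile-based bins:

    import pandas as pd

    df2 = pd.DataFrame(range(1000))
    df2.columns = ['a1']
    # labels=False returns the integer bin number (0 to 99) for each row
    df2['percentile'] = pd.qcut(df2.a1, 100, labels=False)

Or leave out labels to see the range of values that each bin covers.

Note: in Python 3 with pandas 0.16.2 (the latest version as of today), you need to use list(range(1000)) instead of range(1000) for the above to work.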
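As a rough illustration of what leaving out labels gives you, the sketch below runs the same qcut call without labels=False. The variable names bins, codes and first_interval are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Same toy frame as in the answer: the integers 0..999 in column 'a1'.
df2 = pd.DataFrame({"a1": list(range(1000))})

# Without labels=False, qcut returns a categorical Series whose values
# are the intervals (value ranges) of the 100 quantile bins.
bins = pd.qcut(df2["a1"], 100)

# The integer code of each interval is the same bin number that
# labels=False would have returned.
codes = bins.cat.codes

# The interval that the first row (value 0) falls into.
first_interval = bins.iloc[0]
print(first_interval)
```

Keeping the intervals is handy for checking where the bin edges fall. The integer codes are usually easier to store in a column.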
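The answer bins into 100 equal-sized groups, while the question uses the grid np.linspace(0, 1, 20). A minimal sketch of applying that same grid with qcut follows. It assumes a stand-in DataFrame with random log-normal data, since the original df is not shown, and it assumes the bin edges are unique (qcut raises an error on duplicate edges unless duplicates='drop' is passed).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data: a skewed, positive column.
rng = np.random.default_rng(0)
df = pd.DataFrame({"col_1": rng.lognormal(mean=1.0, sigma=2.0, size=1000)})

# Same quantile grid as in describe(percentiles=np.linspace(0, 1, 20)):
# 20 edges, hence 19 bins.
quantiles = np.linspace(0, 1, 20)

# labels=False gives the integer bin index (0..18) for each row.
df["col_2"] = pd.qcut(df["col_1"], q=quantiles, labels=False)

# Number of rows that landed in each bin.
print(df["col_2"].value_counts().sort_index())
```

With this grid, a value of 0 in col_2 means the row lies between the 0% and 5.3% marks, 1 means between 5.3% and 10.5%, and so on up to 18 for the top bin.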
