python - Placing every value in its percentile in Pandas -


Consider a series with the following percentiles:

    > df['col_1'].describe(percentiles=np.linspace(0, 1, 20))
    count      13859.000000
    mean         421.772842
    std        14665.298998
    min            1.201755
    0%             1.201755
    5.3%           1.430695
    10.5%          1.438417
    15.8%          1.466462
    21.1%          1.473050
    26.3%          1.500834
    31.6%          1.512218
    36.8%          1.542935
    42.1%          1.579845
    47.4%          1.647162
    50%            1.690612
    52.6%          1.749047
    57.9%          1.955589
    63.2%          2.344475
    68.4%          3.075641
    73.7%          4.466094
    78.9%          8.410964
    84.2%         14.998738
    89.5%         41.363612
    94.7%        162.865079
    100%     1511013.790233
    max      1511013.790233
    Name: col_1, dtype: float64

I would like a column col_2 that holds the percentile each row was assigned to in the calculation made above.

How can I do this in pandas?

    import pandas as pd

    df2 = pd.DataFrame(range(1000))
    df2.columns = ['a1']
    # labels=False returns the integer bin number (0 to 99) for each row
    df2['percentile'] = pd.qcut(df2.a1, 100, labels=False)

Or leave out labels to see the value range of each bin instead of its number.
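To illustrate what omitting labels returns, here is a minimal sketch using the same toy data. The column name percentile_range and the helper variables are chosen for illustration only, and the exact interval formatting depends on the pandas version.

```python
import pandas as pd

df2 = pd.DataFrame({'a1': list(range(1000))})

# Without labels=False, qcut returns a categorical whose categories
# are the value ranges (intervals) of the quantile bins.
bins = pd.qcut(df2['a1'], 100)
df2['percentile_range'] = bins  # illustrative column name

# With 1000 evenly spaced values, each of the 100 bins holds 10 rows.
n_bins = bins.cat.categories.size
rows_per_bin = bins.value_counts()

print(n_bins)
print(bins.cat.categories[0])  # the lowest bin's range
```

Printing df2.head() then shows each value next to the range of the bin it fell into.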


Note: in Python 3 with pandas 0.16.2 (the latest version as of today), you need to use list(range(1000)) instead of range(1000) for the above to work.
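The same idea can probably be applied with the quantile grid from the question, np.linspace(0, 1, 20), by passing the grid to pd.qcut in place of a bin count. The sketch below uses randomly generated stand-in data because the original df['col_1'] is not available; labelling each row with the upper quantile of its bin is one possible convention, not something the question specifies.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data (skewed, positive values).
rng = np.random.default_rng(0)
df = pd.DataFrame({'col_1': rng.lognormal(mean=1.0, sigma=2.0, size=1000)})

# Same quantile grid as in describe(): 20 edges, hence 19 bins.
quantiles = np.linspace(0, 1, 20)

# Integer bin number (0 to 18) for each row.
codes = pd.qcut(df['col_1'], quantiles, labels=False)

# Assumed convention: tag each row with the upper quantile of its bin,
# e.g. about 0.053 for the lowest bin and 1.0 for the highest.
df['col_2'] = quantiles[1:][codes.to_numpy()]

print(df.head())
```

If the real column contains many repeated values, some quantile edges can coincide and pd.qcut raises an error; its duplicates='drop' argument (available in newer pandas versions) merges such bins, at the cost of ending up with fewer than 19.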

