python - Hashes memory usage is higher than Instagram test result


Hi, I am new to Redis and am following the Instagram engineering blog for optimization purposes. I tested the memory usage of storing 1 million keys through hashes (1000 hashes having 1000 keys each). According to the Instagram post, this took 16 MB of storage space, but my test took 38 MB. Can anyone tell me where I am going wrong?

Here is my test code:

    # -*- coding: utf-8 -*-
    import redis

    #pool = redis.ConnectionPool(host='127.0.0.1', port=6379, db=4)
    num_entries = 1000000
    max_val = 12000000

    def createdata(min, max, userid):
        r_server = redis.Redis(host='localhost', port=6379, db=5)
        p = r_server.pipeline()
        # 1000 hashes ('follower:0' .. 'follower:999') with 1000 fields each
        for i in range(0, 1000):
            for j in range(0, 1000):
                p.hset('follower:%s' % (i), j, j)
        p.execute()
        # Report the server-wide memory usage after loading
        size = int(r_server.info()['used_memory'])
        print('%s bytes, %s mb' % (size, size // 1024 // 1024))

Redis INFO output:

    # Server
    redis_version:2.8.9
    redis_git_sha1:00000000
    redis_git_dirty:0
    redis_build_id:a9b5dff7da49156c
    redis_mode:standalone
    os:Linux 3.19.0-15-generic x86_64
    arch_bits:64
    multiplexing_api:epoll
    gcc_version:4.9.2
    process_id:11037
    run_id:c069c22be15f6b7cbd6490cea6d4ca497d8ad7cb
    tcp_port:6379
    uptime_in_seconds:230666
    uptime_in_days:2
    hz:10
    lru_clock:8643496
    config_file:

    # Clients
    connected_clients:1
    client_longest_output_list:0
    client_biggest_input_buf:0
    blocked_clients:0

    # Memory
    used_memory:41186920
    used_memory_human:39.28M
    used_memory_rss:60039168
    used_memory_peak:256243984
    used_memory_peak_human:244.37M
    used_memory_lua:33792
    mem_fragmentation_ratio:1.46
    mem_allocator:jemalloc-3.2.0

    # Persistence
    loading:0
    rdb_changes_since_last_save:0
    rdb_bgsave_in_progress:0
    rdb_last_save_time:1434659507
    rdb_last_bgsave_status:ok
    rdb_last_bgsave_time_sec:0
    rdb_current_bgsave_time_sec:-1
    aof_enabled:0
    aof_rewrite_in_progress:0
    aof_rewrite_scheduled:0
    aof_last_rewrite_time_sec:-1
    aof_current_rewrite_time_sec:-1
    aof_last_bgrewrite_status:ok
    aof_last_write_status:ok

    # Stats
    total_connections_received:21
    total_commands_processed:3010067
    instantaneous_ops_per_sec:0
    rejected_connections:0
    sync_full:0
    sync_partial_ok:0
    sync_partial_err:0
    expired_keys:0
    evicted_keys:0
    keyspace_hits:10
    keyspace_misses:0
    pubsub_channels:0
    pubsub_patterns:0
    latest_fork_usec:2774

    # Replication
    role:master
    connected_slaves:0
    master_repl_offset:0
    repl_backlog_active:0
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:0
    repl_backlog_histlen:0

    # CPU
    used_cpu_sys:264.43
    used_cpu_user:110.01
    used_cpu_sys_children:0.27
    used_cpu_user_children:1.55

    # Keyspace
    db5:keys=1000,expires=0,avg_ttl=0

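This is due to your server using the default setting of hash-max-ziplist-entries, which is 512. Since you're storing 1000 fields in each hash, every hash exceeds that limit, so Redis converts it from the compact ziplist encoding to a regular hash table, which uses more memory. Here's a little test that I ran using your snippet: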

    foo@bar:/tmp$ redis-cli config get hash-max-ziplist-entries
    1) "hash-max-ziplist-entries"
    2) "512"
    foo@bar:/tmp$ time python so.py
    56791944 bytes, 54 mb

    real    0m23.225s
    user    0m18.574s
    sys     0m0.377s
    foo@bar:/tmp$ redis-cli config set hash-max-ziplist-entries 1000
    OK
    foo@bar:/tmp$ redis-cli flushall
    OK
    foo@bar:/tmp$ time python so.py
    9112080 bytes, 8 mb

    real    0m28.928s
    user    0m18.663s
    sys     0m0.315s
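If you'd rather confirm this from Python than infer it from the memory numbers, something along these lines should work. It is only a sketch: `explain_hash_encoding` is a helper name I made up, and `client` is assumed to be a redis-py client such as the `redis.Redis(host='localhost', port=6379, db=5)` object from the question. It relies on the redis-py methods `config_get`, `hlen` and `object`, the last of which issues `OBJECT ENCODING`.

```python
def explain_hash_encoding(client, key, setting="hash-max-ziplist-entries"):
    """Report how Redis currently stores the hash at `key`.

    `client` is assumed to be a redis-py client (hypothetical usage:
    redis.Redis(host='localhost', port=6379, db=5)).
    """
    # CONFIG GET returns a dict of {setting_name: value_as_string}
    limit = int(client.config_get(setting)[setting])
    # Number of fields stored in this hash
    fields = client.hlen(key)
    # OBJECT ENCODING <key>, e.g. "ziplist" or "hashtable"
    encoding = client.object("encoding", key)
    if isinstance(encoding, bytes):
        # redis-py returns bytes unless decode_responses=True
        encoding = encoding.decode()
    return {
        "fields": fields,
        "limit": limit,
        "encoding": encoding,
        # A hash with more fields than the limit cannot stay in the compact encoding
        "over_limit": fields > limit,
    }
```

Calling `explain_hash_encoding(r_server, 'follower:0')` against the data from the question should show 1000 fields against a limit of 512. I would expect the encoding to read `hashtable` with the default setting and `ziplist` once the limit is raised to 1000 and the data is flushed and reloaded, as in the session above.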
