python - ValueError: invalid literal for float() when inserting a substring of "2015-05-21T18:11:55" into a DataFrame
I have a key-value pair in a JSON-derived dictionary that looks like this:
u'local_start_time': u'2015-05-21T18:11:55.000Z'
When I try to insert a portion of that string into a DataFrame, I get this error:
  File "fix_runs_prepare.py", line 63, in <module>
    df.set_value(i, name, str(g[name])[0:19])
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 1679, in set_value
    engine.set_value(series.values, index, value)
  File "pandas/index.pyx", line 118, in pandas.index.IndexEngine.set_value (pandas/index.c:3382)
  File "pandas/index.pyx", line 132, in pandas.index.IndexEngine.set_value (pandas/index.c:3264)
  File "pandas/src/util.pxd", line 60, in util.set_value_at (pandas/index.c:15472)
ValueError: invalid literal for float(): 2015-05-21T18:11:55
This happens when inserting with the following loop:
run_info = df['run_info']
for i in range(len(df['run_info'])):
    g = run_info[i]
    for name in name_list:
        if g.get(name):
            if name == 'local_start_time':
                # keep only the first 19 characters, e.g. 2015-05-21T18:11:55
                df.set_value(i, name, str(g[name])[0:19])
            else:
                df.set_value(i, name, g[name])
I get the same error if I don't first cast to a string:
df.set_value(i, name, g[name][0:19])
On the other hand, if I insert the string literal "baloney", I do not get an error. I think something funky is going on because the string I'm using begins with a number. That's why I tried explicitly casting with str().
Since that didn't work, I'm out of ideas. What else should I try?
Addendum: here is df.head():
                        _id country    id_2  location_fail  no_location  \
0  55721992afe58716147ed3e8     NaN  212508            NaN            1
1  55721992afe58716147ed3e9     NaN  212508            NaN          NaN
2  55721992afe58716147ed3ea     NaN  212508            NaN          NaN
3  55721992afe58716147ed3ec     NaN  400134              1            1
4  557219d4afe58716147edbd4  poland  513751            NaN          NaN

         run                                           run_info gender  \
0  526956965  {u'tagged_users': [], u'hashtags': [], u'feed_...    NaN
1  512136570  {u'tagged_users': [], u'hashtags': [], u'feed_...    NaN
2  510056284  {u'distance': 0.0, u'playlist': [], u'author':...    NaN
3  525398093  {u'motivation': {u'duration': 1.5, u'distance'...    NaN
4  477634373  {u'tagged_users': [], u'hashtags': [], u'speed...    NaN

   weight  height ...  descent  calories  heart_rate  heart_rate_max  steps  \
0     NaN     NaN ...      NaN       NaN         NaN             NaN    NaN
1     NaN     NaN ...      NaN       NaN         NaN             NaN    NaN
2     NaN     NaN ...      NaN       NaN         NaN             NaN    NaN
3     NaN     NaN ...      NaN       NaN         NaN             NaN    NaN
4     NaN     NaN ...      NaN       NaN         NaN             NaN    NaN

   notes  speed_avg  heart_rate_avg  speed_max  local_start_time
0    NaN        NaN             NaN        NaN               NaN
1    NaN        NaN             NaN        NaN               NaN
2    NaN        NaN             NaN        NaN               NaN
3    NaN        NaN             NaN        NaN               NaN
4    NaN        NaN             NaN        NaN               NaN
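The problem is that pandas.DataFrame treats the datatype of the cells as object and tries to infer the datatype if you don't specify it explicitly. In the df.head() output in the question, local_start_time contains only NaN, so that column has most likely been inferred as a float column, which would explain why set_value calls float() on the string.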
To avoid that, explicitly set the datatype of the columns you want to change, using DataFrame.astype:
df[[name]] = df[[name]].astype(str)
# or
df[[name]] = df[[name]].astype(float)
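As a rough illustration of the dtype fix, here is a minimal sketch on a made-up two-row frame (the data and the second timestamp are hypothetical, not taken from the question). It makes two assumptions that differ from the code in this thread: it casts to object rather than str, because astype(str) would turn the existing NaN cells into the literal string 'nan', and it assigns with .at, because set_value was deprecated and later removed in newer pandas releases.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the question's frame: the target column starts as all NaN.
df = pd.DataFrame({
    "run_info": [
        {"local_start_time": "2015-05-21T18:11:55.000Z"},
        {"local_start_time": "2015-05-22T07:30:00.000Z"},
    ],
    "local_start_time": [np.nan, np.nan],
})

# An all-NaN column is inferred as a float column, which cannot hold a string.
dtype_before = df["local_start_time"].dtype

# Cast the column to object so it can hold arbitrary Python values.
# (Assumption: object instead of str, to keep untouched cells as NaN.)
df["local_start_time"] = df["local_start_time"].astype(object)

for i in range(len(df)):
    g = df.at[i, "run_info"]
    if g.get("local_start_time"):
        # Keep only the first 19 characters, dropping the ".000Z" suffix.
        df.at[i, "local_start_time"] = str(g["local_start_time"])[0:19]

print(dtype_before)
print(df["local_start_time"].tolist())
```

If the column should end up holding real timestamps rather than strings, converting it afterwards with pd.to_datetime is another option, but that goes beyond what the answer above suggests.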