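1. The object dtype is a special data type in a pandas DataFrame. A column becomes object when it mixes two or more kinds of values, such as numbers, strings, special characters, and date/time strings. Even after the different kinds of values are split apart, each piece remains object until it is explicitly converted.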
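A minimal sketch of this behaviour, using a made-up Series (the names `mixed`, `numeric_part`, and `converted` are illustrative and not part of the original data):

```python
import pandas as pd

# Hypothetical column mixing an int, a string, a float, and a date-like string
mixed = pd.Series([1, "abc", 2.5, "2020-01-01"])
print(mixed.dtype)  # object

# Keep only the numeric entries; the subset inherits the object dtype
numeric_part = mixed[mixed.map(lambda v: isinstance(v, (int, float)))]
print(numeric_part.dtype)  # still object

# An explicit conversion is what finally changes the dtype
converted = numeric_part.astype(float)
print(converted.dtype)  # float64
```

Selecting a subset never re-infers the dtype, which is why the split-off numeric part stays object until `astype()` is called.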
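In the code below, replace() only swaps out the placeholder values. The column is usually still object afterwards, so one more conversion with astype() (or pd.to_datetime() for dates) is needed to leave the object dtype. In some cases this extra conversion turns out to be unnecessary.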
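import pandas as pd

# `train` is assumed to be already loaded; "\N" marks missing values in the raw data
print(train.info())  # dtypes before conversion
# Use a raw string: a plain "\N" is a malformed escape sequence in Python 3
train['repay_date'] = train['repay_date'].replace(r"\N", '2020-01-01')
train['repay_date'] = pd.to_datetime(train['repay_date'])
train['repay_amt'] = train['repay_amt'].replace(r"\N", 0)
train['repay_amt'] = train['repay_amt'].astype(float)
print(train.info())  # dtypes after conversion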
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 1000000 entries, 0 to 999999
# Data columns (total 7 columns):
# user_id 1000000 non-null int64
# listing_id 1000000 non-null int64
# due_date 1000000 non-null datetime64[ns]
# due_amt 1000000 non-null float64
# repay_date 1000000 non-null object
# repay_amt 1000000 non-null object
# order_id 1000000 non-null int64
# dtypes: datetime64[ns](1), float64(1), int64(3), object(2)
# memory usage: 53.4+ MB
# None
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 1000000 entries, 0 to 999999
# Data columns (total 7 columns):
# user_id 1000000 non-null int64
# listing_id 1000000 non-null int64
# due_date 1000000 non-null datetime64[ns]
# due_amt 1000000 non-null float64
# repay_date 1000000 non-null datetime64[ns]
# repay_amt 1000000 non-null float64
# order_id 1000000 non-null int64
# dtypes: datetime64[ns](2), float64(2), int64(3)
# memory usage: 53.4 MB
# None
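
The replace-then-convert pattern can be tried on a small frame without the full data set. This is only a sketch: `toy` and its values are invented stand-ins for `train`, and the columns are created as object on purpose to mimic raw data that contains `\N` placeholders.

```python
import pandas as pd

# Hypothetical stand-in for `train`; r"\N" marks a missing value
toy = pd.DataFrame(
    {
        "repay_date": ["2018-03-01", r"\N", "2018-04-15"],
        "repay_amt": ["72.1", r"\N", "180.5"],
    },
    dtype=object,
)

# Step 1: replace() swaps the placeholder but the columns stay object
toy["repay_date"] = toy["repay_date"].replace(r"\N", "2020-01-01")
toy["repay_amt"] = toy["repay_amt"].replace(r"\N", 0)
dtype_after_replace = toy["repay_amt"].dtype
print(dtype_after_replace)  # object

# Step 2: explicit conversions give the columns real dtypes
toy["repay_date"] = pd.to_datetime(toy["repay_date"])
toy["repay_amt"] = toy["repay_amt"].astype(float)
print(toy.dtypes)
```

If a fill value is not wanted, `pd.to_numeric(column, errors="coerce")` is another option: it converts unparseable entries such as `\N` to NaN instead of requiring a replacement first.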