pandas: powerful Python data analysis toolkit - 0.25
… decoded to unicode in the result: In [77]: data = (b'word,length\n' ....: b'Tr\xc3\xa4umen,7\n' ....: b'Gr\xc3\xbc\xc3\x9fe,5') In [78]: data = data.decode('utf8').encode('latin-1') In [79]: df = pd.read_csv(BytesIO(data), encoding='latin-1') In [80]: df Out[80]: word length 0 Träumen 7 1 Grüße 5 In [81]: df['word'][1] Out[81]: 'Grüße' Some formats which encode all characters as multiple bytes, like UTF-16 … 'UK', 'GR', 'JP']) In [109]: key = countries[np.random.randint(0, 4, 1000)] In [110]: grouped = data_df.groupby(key) # Non-NA count in each group In [111]: grouped.count() Out[111]: A B C GR 244 256 …
698 pages | 4.91 MB | 1 year ago
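This entry and several below excerpt the same read_csv encoding example: raw bytes in a known encoding are wrapped in BytesIO and read with an explicit encoding= so the strings come back as proper unicode. A minimal runnable sketch of that pattern, assuming only pandas and the standard-library io module (the word/length payload mirrors the excerpt):

# Read latin-1 encoded bytes; encoding='latin-1' ensures 'Träumen' and
# 'Grüße' are decoded correctly instead of coming back as mojibake.
from io import BytesIO

import pandas as pd

# UTF-8 bytes re-encoded to latin-1, simulating a file in that encoding.
data = (b'word,length\n'
        b'Tr\xc3\xa4umen,7\n'
        b'Gr\xc3\xbc\xc3\x9fe,5')
data = data.decode('utf8').encode('latin-1')

df = pd.read_csv(BytesIO(data), encoding='latin-1')
print(df['word'][1])  # 'Grüße'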
pandas: powerful Python data analysis toolkit - 0.12
… non-null values C 820 non-null values dtypes: float64(3) In [69]: countries = np.array(['US', 'UK', 'GR', 'JP']) In [70]: key = countries[np.random.randint(0, 4, 1000)] In [71]: grouped = data_df.groupby(key) # Non-NA count in each group In [72]: grouped.count() A B C GR 219 223 194 JP 238 250 211 UK 228 239 213 US 223 241 202 In [73]: f = lambda x: x.fillna(x.mean()) … In [75]: grouped_trans = transformed.groupby(key) In [76]: grouped.mean() # original group means A B C GR 0.093655 -0.004978 -0.049883 JP -0.067605 0.025828 0.006752 UK -0.054246 0.031742 0.068974 US 0…
657 pages | 3.58 MB | 1 year ago
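The groupby fragments that recur throughout these entries come from one example: rows are keyed by a random country label, non-NA values are counted per group, and missing values are then filled with their group's mean via transform, which leaves the group means unchanged. A minimal sketch of that pattern (the column names A/B/C and the country labels follow the excerpts; the data and NaN positions are invented for illustration):

# Group by an external key array, count non-NA values, then fill NaNs
# with each group's own mean using transform.
import numpy as np
import pandas as pd

data_df = pd.DataFrame(np.random.randn(1000, 3), columns=['A', 'B', 'C'])
data_df.loc[::7, 'A'] = np.nan  # sprinkle some missing values (illustrative)

countries = np.array(['US', 'UK', 'GR', 'JP'])
key = countries[np.random.randint(0, 4, 1000)]

grouped = data_df.groupby(key)
print(grouped.count())          # non-NA count in each group

f = lambda x: x.fillna(x.mean())
transformed = grouped.transform(f)
grouped_trans = transformed.groupby(key)

print(grouped.mean())           # original group means
print(grouped_trans.mean())     # unchanged by the transformation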
pandas: powerful Python data analysis toolkit - 0.24.0
… to unicode in the result: In [77]: data = (b'word,length\n' ....: b'Tr\xc3\xa4umen,7\n' ....: b'Gr\xc3\xbc\xc3\x9fe,5') In [78]: data = data.decode('utf8').encode('latin-1') In [79]: df = pd.read_csv(BytesIO(data), encoding='latin-1') In [80]: df Out[80]: word length 0 Träumen 7 1 Grüße 5 In [81]: df['word'][1] Out[81]: 'Grüße' Some formats which encode all characters … 'UK', 'GR', 'JP']) In [104]: key = countries[np.random.randint(0, 4, 1000)] In [105]: grouped = data_df.groupby(key) # Non-NA count in each group In [106]: grouped.count() Out[106]: A B C GR 209 217 …
2973 pages | 9.90 MB | 1 year ago
pandas: powerful Python data analysis toolkit - 0.15
… 4 9 In [12]: gr = df.groupby(df['jim'] < 2) previous behavior (excludes 1st column from output): In [4]: gr.apply(sum) Out[4]: joe jim False 24 True 11 current behavior: In [13]: gr.apply(sum) Out[13]: … 'UK', 'GR', 'JP']) In [76]: key = countries[np.random.randint(0, 4, 1000)] In [77]: grouped = data_df.groupby(key) # Non-NA count in each group In [78]: grouped.count() Out[78]: A B C GR 209 217 … grouped_trans = transformed.groupby(key) In [82]: grouped.mean() # original group means Out[82]: A B C GR -0.098371 -0.015420 0.068053 JP 0.069025 0.023100 -0.077324 UK 0.034069 -0.052580 -0.116525 US 0…
1579 pages | 9.15 MB | 1 year ago
pandas: powerful Python data analysis toolkit - 0.15.1
… 4 9 In [12]: gr = df.groupby(df['jim'] < 2) previous behavior (excludes 1st column from output): In [4]: gr.apply(sum) Out[4]: joe jim False 24 True 11 current behavior: In [13]: gr.apply(sum) Out[13]: … 'UK', 'GR', 'JP']) In [74]: key = countries[np.random.randint(0, 4, 1000)] In [75]: grouped = data_df.groupby(key) # Non-NA count in each group In [76]: grouped.count() Out[76]: A B C GR 209 217 … grouped_trans = transformed.groupby(key) In [80]: grouped.mean() # original group means Out[80]: A B C GR -0.098371 -0.015420 0.068053 JP 0.069025 0.023100 -0.077324 UK 0.034069 -0.052580 -0.116525 US 0…
1557 pages | 9.10 MB | 1 year ago
pandas: powerful Python data analysis toolkit - 0.13.1
… 'UK', 'GR', 'JP']) In [73]: key = countries[np.random.randint(0, 4, 1000)] In [74]: grouped = data_df.groupby(key) # Non-NA count in each group In [75]: grouped.count() Out[75]: A B C GR 209 217 … grouped_trans = transformed.groupby(key) In [79]: grouped.mean() # original group means Out[79]: A B C GR -0.098371 -0.015420 0.068053 JP 0.069025 0.023100 -0.077324 UK 0.034069 -0.052580 -0.116525 US 0… … columns] In [80]: grouped_trans.mean() # transformation did not change group means Out[80]: A B C GR -0.098371 -0.015420 0.068053 JP 0.069025 0.023100 -0.077324 … 13.4. Transformation …
1219 pages | 4.81 MB | 1 year ago
pandas: powerful Python data analysis toolkit - 1.0.0
… In [79]: from io import BytesIO In [80]: data = (b'word,length\n' ....: b'Tr\xc3\xa4umen,7\n' ....: b'Gr\xc3\xbc\xc3\x9fe,5') In [81]: data = data.decode('utf8').encode('latin-1') In [82]: df = pd.read_csv(BytesIO(data), encoding='latin-1') In [83]: df Out[83]: word length 0 Träumen 7 1 Grüße 5 In [84]: df['word'][1] Out[84]: 'Grüße' Some formats which encode all characters as multiple bytes, like UTF-16, … 'UK', 'GR', 'JP']) In [109]: key = countries[np.random.randint(0, 4, 1000)] In [110]: grouped = data_df.groupby(key) # Non-NA count in each group In [111]: grouped.count() Out[111]: A B C GR 210 216 …
3015 pages | 10.78 MB | 1 year ago
pandas: powerful Python data analysis toolkit - 0.25.0
… to unicode in the result: In [77]: data = (b'word,length\n' ....: b'Tr\xc3\xa4umen,7\n' ....: b'Gr\xc3\xbc\xc3\x9fe,5') In [78]: data = data.decode('utf8').encode('latin-1') In [79]: df = pd.read_csv(BytesIO(data), encoding='latin-1') In [80]: df Out[80]: word length 0 Träumen 7 1 Grüße 5 In [81]: df['word'][1] Out[81]: 'Grüße' Some formats which encode all characters … 'UK', 'GR', 'JP']) In [109]: key = countries[np.random.randint(0, 4, 1000)] In [110]: grouped = data_df.groupby(key) # Non-NA count in each group In [111]: grouped.count() Out[111]: A B C GR 226 233 …
2827 pages | 9.62 MB | 1 year ago
pandas: powerful Python data analysis toolkit - 0.25.1
… to unicode in the result: In [77]: data = (b'word,length\n' ....: b'Tr\xc3\xa4umen,7\n' ....: b'Gr\xc3\xbc\xc3\x9fe,5') In [78]: data = data.decode('utf8').encode('latin-1') In [79]: df = pd.read_csv(BytesIO(data), encoding='latin-1') In [80]: df Out[80]: word length 0 Träumen 7 1 Grüße 5 In [81]: df['word'][1] Out[81]: 'Grüße' Some formats which encode all characters … 'UK', 'GR', 'JP']) In [109]: key = countries[np.random.randint(0, 4, 1000)] In [110]: grouped = data_df.groupby(key) # Non-NA count in each group In [111]: grouped.count() Out[111]: A B C GR 240 259 …
2833 pages | 9.65 MB | 1 year ago
pandas: powerful Python data analysis toolkit - 0.17.0
… 4 9 In [12]: gr = df.groupby(df['jim'] < 2) previous behavior (excludes 1st column from output): In [4]: gr.apply(sum) Out[4]: joe jim False 24 True 11 current behavior: In [13]: gr.apply(sum) Out[13]: … 'UK', 'GR', 'JP']) In [79]: key = countries[np.random.randint(0, 4, 1000)] In [80]: grouped = data_df.groupby(key) # Non-NA count in each group In [81]: grouped.count() Out[81]: A B C GR 209 217 … grouped_trans = transformed.groupby(key) In [85]: grouped.mean() # original group means Out[85]: A B C GR -0.098371 -0.015420 0.068053 JP 0.069025 0.023100 -0.077324 UK 0.034069 -0.052580 -0.116525 US 0…
1787 pages | 10.76 MB | 1 year ago
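The 0.15, 0.15.1, and 0.17.0 entries above excerpt a whatsnew note about gr.apply(sum) when the grouping key is a boolean condition derived from a column: whether that column is excluded from the output changed between releases. A minimal sketch of the construct being discussed (the jim/joe frame is assumed from the excerpt, with invented values; the exact output shape depends on the pandas version, which is precisely what the note describes):

# Group rows by a boolean mask derived from the 'jim' column, then sum
# each group's columns with apply(sum). Older and newer pandas releases
# differ on whether 'jim' itself appears in the result.
import pandas as pd

df = pd.DataFrame({'jim': [0, 1, 2, 3], 'joe': [5, 6, 7, 8]})  # illustrative values

gr = df.groupby(df['jim'] < 2)  # key is a boolean Series, not a column name
print(gr.apply(sum))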
29 results in total













