pandas: powerful Python data analysis toolkit - 0.12
original object. Suppose we want to take only elements that belong to groups with a group sum greater than 2. In [36]: sf = Series([1, 1, 2, 3, 3, 3]) In [37]: sf.groupby(sf).filter(lambda x: x.sum() > 2) ... are attempting to append an index with a different frequency than the existing, or attempting to append an index with a different name than the existing – support datelike columns with a timezone as data_columns ... behavior depending on whether the slice is interpreted as position based or label based, it’s usually better to be explicit and use .iloc or .loc. See more at Advanced Indexing, Advanced Hierarchical and Fallback
0 points | 657 pages | 3.58 MB | 1 year ago
pandas: powerful Python data analysis toolkit - 0.13.1
underlying file handle is_open; a closed store will now report 'CLOSED' when viewing the store (rather than raising an error) (GH4409) • a close of a HDFStore now will close that instance of the HDFStore but ... representations of DataFrame now show a truncated view of the table once it exceeds a certain size, rather than switching to the short info view (GH4886, GH5550). This makes the representation more consistent as ... NaN dtype: object Elements that do not match return NaN. Extracting a regular expression with more than one group returns a DataFrame with one column per group. In [87]: Series(['a1', 'b2', 'c3']).str
0 points | 1219 pages | 4.81 MB | 1 year ago
pandas: powerful Python data analysis toolkit - 0.14.0
indexer is out-of-bounds • Slicing with negative start, stop & step values handles corner cases better (GH6531): – df.iloc[:-len(df)] is now empty – df.iloc[len(df)::-1] now enumerates all elements ... 145447 C 0.058945 0.335350 0.390637 • Series.iteritems() is now lazy (returns an iterator rather than a list). This was the documented behavior ... be interpreted as the levels of the index, rather than requiring a list of tuple (GH4370) • all offset operations now return Timestamp types (rather than datetime), Business/Week frequencies were incorrect
0 points | 1349 pages | 7.67 MB | 1 year ago
pandas: powerful Python data analysis toolkit - 0.17.0
17.0 • to_datetime can now accept the yearfirst keyword (GH7599) • pandas.tseries.offsets larger than the Day offset can now be used with a Series for addition/subtraction (GH10699). See the docs for ... None Boolean comparisons of a Series vs None will now be equivalent to comparing with np.nan, rather than raise TypeError. (GH1079). In [71]: s = Series(range(3)) In [72]: s.iloc[1] = None ... (GH10451). Earlier versions of pandas would format floating point numbers to have one less decimal place than the value in display.precision. In [1]: pd.set_option('display.precision', 2) In [2]: pd.DataFrame({'x':
0 points | 1787 pages | 10.76 MB | 1 year ago
pandas: powerful Python data analysis toolkit - 0.15
in MultiIndex beyond lex-sort depth is now supported, though a lexically sorted index will have a better performance. (GH2646) In [1]: df = pd.DataFrame({'jim':[0, 0, 1, 1], ...: 'joe':['x', 'x', 'z' ... • Timestamp('now') is now equivalent to Timestamp.now() in that it returns the local time rather than UTC. Also, Timestamp('today') is now equivalent to Timestamp.today() and both have tz as a possible ... exception (either tz operated with None or incompatible timezone), will now return TypeError rather than ValueError (a couple of edge cases only), (GH8865) • Bug in using a pd.Grouper(key=...) with no level/axis
0 points | 1579 pages | 9.15 MB | 1 year ago
pandas: powerful Python data analysis toolkit - 0.15.1
API changes • s.dt.hour and other .dt accessors will now return np.nan for missing values (rather than previously -1), (GH8689) In [1]: s = Series(date_range('20130101',periods=5,freq='D')) In [2]: s ... break the entire response. (GH8482) • Added option to Series.str.split() to return a DataFrame rather than a Series (GH8428) • Added option to df.info(null_counts=None|True|False) to override the default ... sub-class ndarray, see Internal Refactoring – dropping support for PyTables less than version 3.0.0, and numexpr less than version 2.1 (GH7990) – Split indexing documentation into Indexing and Selecting
0 points | 1557 pages | 9.10 MB | 1 year ago
pandas: powerful Python data analysis toolkit - 0.7.1
functions significantly sped up by clever manipulation of the ndarray data type in Cython (GH496). • Better error message in DataFrame constructor when passed column labels don’t match data (GH497) • Substantially ... SparseArray and SparseList data structures. SparseSeries now derives from SparseArray (GH463) • Better console printing options (PR453) • Implement fast data ranking for Series and DataFrame, fast versions ... DataFrame.from_items alternate constructor (GH444) • DataFrame.convert_objects method for inferring better dtypes for object columns (GH302) • Add rolling_corr_pairwise function for computing Panel of correlation
0 points | 281 pages | 1.45 MB | 1 year ago
pandas: powerful Python data analysis toolkit - 0.7.2
functions significantly sped up by clever manipulation of the ndarray data type in Cython (GH496). • Better error message in DataFrame constructor when passed column labels don’t match data (GH497) • Substantially ... SparseArray and SparseList data structures. SparseSeries now derives from SparseArray (GH463) • Better console printing options (PR453) • Implement fast data ranking for Series and DataFrame, fast versions ... DataFrame.from_items alternate constructor (GH444) • DataFrame.convert_objects method for inferring better dtypes for object columns (GH302) • Add rolling_corr_pairwise function for computing Panel of correlation
0 points | 283 pages | 1.45 MB | 1 year ago
pandas: powerful Python data analysis toolkit - 0.7.3
functions significantly sped up by clever manipulation of the ndarray data type in Cython (GH496). • Better error message in DataFrame constructor when passed column labels don’t match data (GH497) • Substantially ... SparseArray and SparseList data structures. SparseSeries now derives from SparseArray (GH463) • Better console printing options (PR453) • Implement fast data ranking for Series and DataFrame, fast versions ... DataFrame.from_items alternate constructor (GH444) • DataFrame.convert_objects method for inferring better dtypes for object columns (GH302) • Add rolling_corr_pairwise function for computing Panel of correlation
0 points | 297 pages | 1.92 MB | 1 year ago
pandas: powerful Python data analysis toolkit - 0.21.1
Groupby Enhancements ... 1.5.1.5 Better support for compressed URLs in read_csv ... 1.5.1.6 Pickle file ... 4.1.1 Why more than one data structure? ... 4.2 Mutability ... datetime.datetime and a datetime64[ns] dtype Series (GH17965) • Bug where a MultiIndex with more than a million records was not raising AttributeError when trying to access a missing attribute (GH18165)
0 points | 2207 pages | 8.59 MB | 1 year ago
32 results in total













