Notes in Week 6 - Pandas

Status Last Update Fields
Published 11/26/2024 Pandas is often used alongside {{c1::NumPy and SciPy}} for numerical computing and {{c2::Matplotlib}} for data visualization.
Published 11/26/2024 Compared to NumPy, Pandas is designed to work with {{c1::tabular}} or {{c2::heterogeneous}} data.
Published 11/26/2024 In Pandas, the two primary data structures are {{c1::Series}} and {{c2::DataFrame}}.
Published 11/26/2024 A Pandas Series is a {{c1::one-dimensional array-like}} object containing a sequence of values and an associated array of data labels called {{c2::ind…
Published 11/26/2024 A DataFrame is a {{c1::two-dimensional table}} of data with ordered, named columns, each of which can be of a different type.
Published 11/26/2024 A Series can be thought of as a {{c1::fixed-length, ordered dictionary}}, mapping index values to data values.
Published 11/26/2024 The simplest form of a Series is created from only an {{c1::array of data}} without specifying an {{c2::index}}.
Published 11/26/2024 A Series can be converted back to a dictionary using the {{c1::to_dict}} method; a Series built from a dictionary respects the {{c2::order of keys}} in the original dictionary.
Published 11/26/2024 In a Series, missing values are marked with {{c1::NaN}} (Not a Number), representing {{c2::missing}} or {{c3::NA values}}.
Published 11/26/2024 The functions {{c1::isna}} and {{c2::notna}} are used to detect {{c3::missing data}} in Pandas.
Published 11/26/2024 A useful feature of the {{c2::Series}} type is that it automatically aligns by {{c1::index label}} in arithmetic operations.
Published 11/26/2024 A DataFrame can be thought of as a dictionary of {{c1::Series}} all sharing the same {{c2::index}}.
Published 11/26/2024 In a DataFrame, the `head()` method selects the first {{c1::five rows}} by default, while `tail()` selects the last five rows.
Published 11/26/2024 DataFrames allow row selection by {{c1::position}} or name using the `iloc` and `loc` attributes, respectively.
Published 11/26/2024 When creating a DataFrame from a dictionary, the resulting DataFrame’s columns are ordered based on the {{c1::insertion order}} of the dictionary keys…
Published 11/26/2024 In DataFrames, rows can be retrieved by position or {{c1::label}}, and columns can be accessed by label.
Published 11/26/2024 To delete columns in a DataFrame, use the {{c1::del keyword}}, similar to deleting keys in a dictionary.
Published 11/26/2024 Assigning a column that doesn’t exist in a DataFrame will {{c1::create a new column}}.
Published 11/26/2024 In Pandas, indexing with square brackets `[ ]` is preferred for selecting {{c1::columns}}.
Published 11/26/2024 The `loc` operator in Pandas is used for {{c1::label-based}} indexing, while `iloc` is used for integer-based indexing.
Published 11/26/2024 Boolean indexing in a DataFrame allows selection based on a specified {{c1::condition}}.
Published 11/26/2024 In reindexing, the `method` option allows filling values using techniques like {{c1::ffill}} for forward-filling.
Published 11/26/2024 The `drop` method is used to {{c1::remove entries}} from a DataFrame axis, such as rows or columns.
Published 11/26/2024 Indexing with integer-based labels can be ambiguous, so using {{c1::loc}} for labels and {{c2::iloc}} for integers is recommended.
Published 11/26/2024 Avoid {{c1::chained}} indexing when doing assignments in Pandas to prevent unintended results.
Published 11/26/2024 If two objects in Pandas have different indexes during arithmetic, the result will contain the {{c1::union}} of the indexes.
Published 11/26/2024 To avoid NaN values in operations on objects with different indexes, use the `add` method with a specified {{c1::fill_value}}.
Published 11/26/2024 Data alignment in DataFrames occurs on both {{c1::rows}} and {{c2::columns}} during arithmetic.
Published 11/26/2024 In arithmetic operations between a DataFrame and Series, the Series index aligns by default with the DataFrame’s {{c1::columns}}.
Published 11/26/2024 To perform row-wise alignment in DataFrame and Series operations, use an arithmetic method such as `sub` with {{c1::`axis="index"`}}.
Published 11/26/2024 The `apply` method in DataFrames is used to apply a function to each {{c1::column}} or row, depending on the specified axis.
Published 11/26/2024 Element-wise functions can be applied in DataFrames using {{c1::`applymap`}} (deprecated since pandas 2.1 in favor of `DataFrame.map`).
Published 11/26/2024 To sort data in a DataFrame by row or column label, use the {{c1::sort_index}} method.
Published 11/26/2024 The `sort_values` method sorts a Series by its {{c1::values}}, with missing values sorted to the end by default.
Published 11/26/2024 DataFrames can be sorted by one or multiple columns using {{c1::sort_values}}, specifying the column names.
Published 11/26/2024 The `rank` method assigns ranks starting from the {{c1::lowest value}}, with ties averaged by default.
Published 11/26/2024 Setting `ascending=False` in the `rank` method allows ranking in {{c1::descending order}}.
Published 11/26/2024 The `{{c2::is_unique}}` property of an index indicates whether its labels are {{c1::unique}}.
Published 11/26/2024 In a Series, selecting a duplicated index label returns a {{c1::Series}}; a unique label returns a scalar.
Published 11/26/2024 The `{{c2::sum}}` method on a DataFrame produces a Series with the {{c1::column sums}}.
Published 11/26/2024 Setting {{c2::`axis="columns"`}} in the `sum` method calculates sums across the {{c1::columns}}, giving one value per row.
Published 11/26/2024 Setting {{c2::`skipna=True`}} (the default) in a reduction method skips {{c1::NA values}} during calculations.
Published 11/26/2024 The {{c2::`corr`}} method calculates the {{c1::correlation}} between aligned, non-NA values in two Series.
Published 11/26/2024 For DataFrames, {{c2::`corr`}} and `cov` return {{c1::full correlation}} or covariance matrices.
Published 11/26/2024 The `{{c2::corrwith}}` method computes pairwise correlations between a DataFrame and another {{c1::Series}} or DataFrame.
Published 11/26/2024 The `unique` method in a Series returns an array of {{c1::unique values}}.
Published 11/26/2024 The `{{c2::value_counts}}` method calculates the {{c1::frequency}} of unique values in a Series.
Published 11/26/2024 The `{{c2::isin}}` method performs a vectorized {{c1::membership check}} for filtering data.
Published 11/26/2024 The `DataFrame.value_counts` method counts the occurrences of each distinct {{c1::row}} in the DataFrame.
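
Worked example for the notes on Series alignment and missing data. This is a minimal sketch, not part of the original deck; the Series names and values are made up for illustration. It shows that adding two Series with different indexes yields the union of the indexes, that unmatched labels become NaN, and that `add` with `fill_value` avoids those NaN results.

```python
import pandas as pd

# Two illustrative Series whose indexes only partly overlap (made-up data).
s1 = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])
s2 = pd.Series([10.0, 20.0, 30.0], index=["b", "c", "d"])

# Plain arithmetic aligns by index label; the result index is the union
# of both indexes, and labels present in only one Series become NaN.
total = s1 + s2

# isna flags the missing entries produced by the alignment.
missing = total.isna()

# add with fill_value treats a label missing from one side as 0
# instead of producing NaN.
filled = s1.add(s2, fill_value=0)

print(total)
print(missing)
print(filled)
```

Here only "b" and "c" appear in both Series, so `total` is NaN at "a" and "d", while `filled` keeps a number at every label.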
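
Worked example for the notes on `loc`/`iloc`, DataFrame and Series arithmetic, reductions, and duplicate labels. This is a hedged sketch with made-up data and hypothetical variable names, not taken from the original deck. It contrasts the default column-wise alignment with row-wise alignment through `axis="index"`.

```python
import pandas as pd

# Illustrative DataFrame (made-up values).
frame = pd.DataFrame(
    {"x": [1, 2, 3], "y": [10, 20, 30]},
    index=["r1", "r2", "r3"],
)

# loc selects by label, iloc by integer position; both pick the same row here.
by_label = frame.loc["r2"]
by_position = frame.iloc[1]

# Default: the Series index is matched against the DataFrame's columns
# and the operation is broadcast down the rows.
col_offsets = pd.Series({"x": 1, "y": 10})
minus_cols = frame - col_offsets

# Row-wise: match the Series index against the row labels instead.
row_offsets = pd.Series({"r1": 1, "r2": 2, "r3": 3})
minus_rows = frame.sub(row_offsets, axis="index")

# sum() reduces each column; axis="columns" reduces across columns, per row.
col_sums = frame.sum()
row_sums = frame.sum(axis="columns")

# With duplicate index labels, a duplicated label returns a Series,
# while a unique label returns a scalar.
dup = pd.Series([1, 2, 3], index=["a", "a", "b"])
print(dup.index.is_unique)
print(type(dup.loc["a"]), dup.loc["b"])
```

Using `loc` and `iloc` explicitly, rather than bare square brackets, keeps label-based and position-based lookups unambiguous.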