Tidy Data

[2]:
> [...], a stack of elements is a common abstract data type used in computing. We would not think ‘to add’ two stacks as we would two integers.
>> Jeannette Wing - [Computational thinking and thinking about computing][computational thinking]


[computational thinking]: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2696102/

[3]:
A modernist style of notebook programming persists where documents are written as if programs
start from nothing. Meanwhile, authors working in the R programming language tend to begin with the
assumption that data exists and so does code. Notebooks are a powerful substrate for working with data and
describing the logic behind different permutations.

pidgy was designed to weave projections of tabular data into computational documentation. Specifically,
we are concerned with the DataFrame, a popular tidy data abstraction that serves as a first-class
data structure in scientific computing.
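
As a minimal sketch of the tidy convention, assume a hypothetical wide table with one row per sample and one column per variable. The names `wide`, `tidy`, `sample`, `variable`, and `value` are illustrative and do not come from pidgy, and the conventional `pd` alias is used here. Melting the table yields one observation per row, which is the shape most DataFrame operations expect.

```python
import pandas as pd

# Hypothetical wide table: one row per sample, one column per measured variable.
wide = pd.DataFrame({"sample": ["s1", "s2"], "a": [0, 1], "b": [1, 2]})

# Reshape to tidy (long) form: each row holds a single observation.
tidy = wide.melt(id_vars="sample", var_name="variable", value_name="value")

print(tidy)
```

Whether a wide or long layout counts as tidy depends on what we treat as an observation, so this is one possible reading rather than a rule.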

[4]:
    import pandas as 🐼
[5]:
<hr/>

[6]:
The figure above illustrates the information in `df`.

A high-level numeric projection of this data's statistics is:

{{df.describe().to_html()}}

The statistics were created using measurements that look like the following data:

{{df.head(2).to_html()}}

    df = 🐼.DataFrame([range(i, i+4) for i in range(10)], columns=list('abcd'))
    df.plot();
![Line plot of the columns of df](../../_images/docs_examples_working-within-dataframes.md_5_0.png)

The figure above illustrates the information in df.

A high-level numeric projection of this data’s statistics is:

<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
  <th></th>
  <th>a</th>
  <th>b</th>
  <th>c</th>
  <th>d</th>
</tr>
</thead>
<tbody>
<tr>
  <th>count</th>
  <td>10.00000</td>
  <td>10.00000</td>
  <td>10.00000</td>
  <td>10.00000</td>
</tr>
<tr>
  <th>mean</th>
  <td>4.50000</td>
  <td>5.50000</td>
  <td>6.50000</td>
  <td>7.50000</td>
</tr>
<tr>
  <th>std</th>
  <td>3.02765</td>
  <td>3.02765</td>
  <td>3.02765</td>
  <td>3.02765</td>
</tr>
<tr>
  <th>min</th>
  <td>0.00000</td>
  <td>1.00000</td>
  <td>2.00000</td>
  <td>3.00000</td>
</tr>
<tr>
  <th>25%</th>
  <td>2.25000</td>
  <td>3.25000</td>
  <td>4.25000</td>
  <td>5.25000</td>
</tr>
<tr>
  <th>50%</th>
  <td>4.50000</td>
  <td>5.50000</td>
  <td>6.50000</td>
  <td>7.50000</td>
</tr>
<tr>
  <th>75%</th>
  <td>6.75000</td>
  <td>7.75000</td>
  <td>8.75000</td>
  <td>9.75000</td>
</tr>
<tr>
  <th>max</th>
  <td>9.00000</td>
  <td>10.00000</td>
  <td>11.00000</td>
  <td>12.00000</td>
</tr>
</tbody>
</table>

The statistics were created using measurements that look like the following data:

<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
  <th></th>
  <th>a</th>
  <th>b</th>
  <th>c</th>
  <th>d</th>
</tr>
</thead>
<tbody>
<tr>
  <th>0</th>
  <td>0</td>
  <td>1</td>
  <td>2</td>
  <td>3</td>
</tr>
<tr>
  <th>1</th>
  <td>1</td>
  <td>2</td>
  <td>3</td>
  <td>4</td>
</tr>
</tbody>
</table>
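
The double-brace expressions in the cell source resemble Jinja2 template syntax. As a rough sketch of how such expressions could be filled in, assuming a Jinja2-style engine (the variable names `source` and `rendered` are illustrative, and this is not a description of pidgy's internals), we can render a small document against the same `df`:

```python
import pandas as pd
from jinja2 import Template

# Same data as the cell above, using the conventional `pd` alias.
df = pd.DataFrame([range(i, i + 4) for i in range(10)], columns=list("abcd"))

# A document whose prose embeds projections of the DataFrame.
source = (
    "Summary statistics:\n\n"
    "{{ df.describe().to_html() }}\n\n"
    "First rows:\n\n"
    "{{ df.head(2).to_html() }}"
)

# Rendering evaluates each expression and splices the HTML into the prose.
rendered = Template(source).render(df=df)

print(rendered)
```

Because the expressions are evaluated at render time, the prose and the tables stay in step whenever `df` changes.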
[7]:
<hr/>

[8]:
In technical writing we need to consider existing conventions like:
* Figures are placed above their captions
* Tables are placed below their captions

It remains to be seen where code canonically fits in relation to figures and tables.

[Why should a table caption be placed above the table?]

[Why should a table caption be placed above the table?]: https://tex.stackexchange.com/questions/3243/why-should-a-table-caption-be-placed-above-the-table
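
To make the two conventions concrete, here is a small sketch that emits HTML with the caption above a table and below a figure. The helper names `captioned_table` and `captioned_figure`, the file name `plot.png`, and the caption text are all hypothetical, chosen only for illustration.

```python
from html import escape


def captioned_table(rows_html: str, caption: str) -> str:
    """Return a table whose caption precedes the rows (caption above)."""
    return f"<table>\n<caption>{escape(caption)}</caption>\n{rows_html}\n</table>"


def captioned_figure(image_src: str, caption: str) -> str:
    """Return a figure whose caption follows the image (caption below)."""
    return (
        "<figure>\n"
        f'<img src="{escape(image_src)}" alt="{escape(caption)}">\n'
        f"<figcaption>{escape(caption)}</figcaption>\n"
        "</figure>"
    )


table = captioned_table("<tr><td>0</td><td>1</td></tr>", "Table 1. First measurements.")
figure = captioned_figure("plot.png", "Figure 1. Line plot of the measurements.")

print(table)
print(figure)
```

Where the code that produced the table or figure should sit relative to these captions is left open, as noted above.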

[9]:
[notebook war]

[notebook war]: https://yihui.org/en/2018/09/notebook-war/