{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Tidy Data\n", "\n", "[tidy data]\n", "\n", "[tidy data]: https://vita.had.co.nz/papers/tidy-data.pdf" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "> [...], a stack of elements is a common abstract data type used in computing. We would not think ‘to add’ two stacks as we would two integers.\n", ">> Jeannette Wing - [Computational thinking and thinking about computing][computational thinking]\n", "\n", "\n", "[computational thinking]: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2696102/" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "> [...], a stack of elements is a common abstract data type used in computing. We would not think ‘to add’ two stacks as we would two integers.\n", ">> Jeannette Wing - [Computational thinking and thinking about computing][computational thinking]\n", "\n", "\n", "[computational thinking]: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2696102/\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "A modernist style of notebook programming persists where documents are written as if programs are\n", "starting from nothing. Meanwhile, authors working in the R programming language tend to begin with the assumption\n", "that data exists and so does code. Notebooks are a powerful substrate for working with data and\n", "describing the logic behind different permutations.\n", "\n", "pidgy was designed to weave projections of tabular data into computational documentation. Specifically, \n", "we are concerned with the DataFrame, a popular tidy data abstraction that serves as a first-class\n", "data structure in scientific computing."
], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "A modernist style of notebook programming persists where documents are written as if programs are\n", "starting from nothing. Meanwhile, authors working in the R programming language tend to begin with the assumption\n", "that data exists and so does code. Notebooks are a powerful substrate for working with data and\n", "describing the logic behind different permutations.\n", "\n", "pidgy was designed to weave projections of tabular data into computational documentation. Specifically, \n", "we are concerned with the DataFrame, a popular tidy data abstraction that serves as a first-class\n", "data structure in scientific computing." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/markdown": [ " import pandas as 🐼" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ " import pandas as 🐼" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "
" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "data": { "text/markdown": [ "The figure above illustrates the information in `df`.\n", "\n", "A high-level numeric projection of this data's statistics is:\n", "\n",
"| | a | b | c | d |\n",
"|---|---|---|---|---|\n",
"| count | 10.00000 | 10.00000 | 10.00000 | 10.00000 |\n",
"| mean | 4.50000 | 5.50000 | 6.50000 | 7.50000 |\n",
"| std | 3.02765 | 3.02765 | 3.02765 | 3.02765 |\n",
"| min | 0.00000 | 1.00000 | 2.00000 | 3.00000 |\n",
"| 25% | 2.25000 | 3.25000 | 4.25000 | 5.25000 |\n",
"| 50% | 4.50000 | 5.50000 | 6.50000 | 7.50000 |\n",
"| 75% | 6.75000 | 7.75000 | 8.75000 | 9.75000 |\n",
"| max | 9.00000 | 10.00000 | 11.00000 | 12.00000 |\n",
"\n", "The statistics were created using measurements that look like the following data:\n", "\n",
"| | a | b | c | d |\n",
"|---|---|---|---|---|\n",
"| 0 | 0 | 1 | 2 | 3 |\n",
"| 1 | 1 | 2 | 3 | 4 |\n",
" \n", " df = 🐼.DataFrame([range(i, i+4) for i in range(10)], columns=list('abcd'))\n", " df.plot();" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "The figure above illustrates the information in `df`.\n", "\n", "A high-level numeric projection of this data's statistics is:\n", "\n", "{{df.describe().to_html()}}\n", "\n", "The statistics were created using measurements that look like the following data:\n", "\n", "{{df.head(2).to_html()}}\n", " \n", " df = 🐼.DataFrame([range(i, i+4) for i in range(10)], columns=list('abcd'))\n", " df.plot();" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "
" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "In technical writing we need to consider existing conventions like:\n", "* Figures go above their captions\n", "* Tables go below their captions\n", "\n", "It remains to be seen where code canonically fits relative to figures and tables.\n", "\n", "[Why should a table caption be placed above the table?]\n", "\n", "[Why should a table caption be placed above the table?]: https://tex.stackexchange.com/questions/3243/why-should-a-table-caption-be-placed-above-the-table" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "In technical writing we need to consider existing conventions like:\n", "* Figures go above their captions\n", "* Tables go below their captions\n", "\n", "It remains to be seen where code canonically fits relative to figures and tables.\n", "\n", "[Why should a table caption be placed above the table?]\n", "\n", "[Why should a table caption be placed above the table?]: https://tex.stackexchange.com/questions/3243/why-should-a-table-caption-be-placed-above-the-table" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "[notebook war]\n", "\n", "[notebook war]: https://yihui.org/en/2018/09/notebook-war/" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "[notebook war]\n", "\n", "[notebook war]: https://yihui.org/en/2018/09/notebook-war/" ] } ], "metadata": { "kernelspec": { "display_name": "pidgy 3", "language": "python", "name": "pidgy" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }