pidgy programming
pidgy treats code as literature and programming as a literacy. It is an interactive programming workflow in Markdown that allows narrative and code to develop together.
pidgy has literary and computational qualities that:

- Publish documentation and render PDFs using the ReadTheDocs service.
- Make it installable from pip and conda.

      pip install pidgy

- Formally test the literature and source code with Github Actions.
- Make it reusable on Binder.
- Import alternative source files, like notebooks and markdown, into python.
The pidgy package and paper

pidgy is a fun way to program in [Markdown] in your favorite IDE (jupyter, nteract, colab, vscode) that can be reused as python modules, scripts, and applications.
pip install pidgy # Install pidgy
pidgy has a few components to it:

- It is an interactive [Literate Computing] implementation of IPython.
- A specification of a potentially polyglot approach for literate programming applied to other languages.
- A complete unit of computable scientific literature. It is written in a literate programming style with the literature as the primary outcome. Read the pidgy paper.
The pidgy shell and kernel

pidgy can be used as a native jupyter kernel in Jupyter, nteract, colab, and vscode. Install the kernel with

pidgy kernel install # install the pidgy kernel.

Or, in your standard Python shell, load the pidgy IPython extension.
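A minimal sketch of that second path, assuming an interactive IPython/Jupyter session (the `load_ipython_extension` hook defined later in this document is what `%load_ext` picks up):

```python
import IPython

# equivalent to typing `%load_ext pidgy` at the prompt
IPython.get_ipython().run_line_magic("load_ext", "pidgy")
```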
Importing pidgy documents

pidgy uses the importnb machinery to import files into [Python] that are not native ".py" files.
import pidgy
with pidgy.pidgyLoader(): ...
The pidgy CLI
Usage: pidgy [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
kernel
render
run `pidgy` `run` makes it possible to execute `pidgy` documents as...
template
test Formally test markdown documents, notebooks, and python files.
to
pidgy paper

Introduction
A contemporary challenge for science is reprogramming literature for an information-rich scientific economy. Recently, science has been troubled by an inability to reproduce scientific results from their published methods. This reproducibility crisis highlights the inability of natural language to bring meaning to data. In response, open science communities have demonstrated that programming languages can supplement natural language because they are better equipped to communicate meaning in data. These progressive workflows disrupt traditional scientific publications with hypertext and hypermedia that are unbound from the constraints of physical media. These changes mean that modern scientific communications will necessitate different literacies as computability becomes a formal feature of scientific literature.
“Notebooks” have been adopted by differently-abled scientists and authors across disciplines to share stories supplemented with data and code. The availability of millions of open source notebooks demonstrates the potential of interleaved narrative, code, and hypermedia in computational thinking. The Jupyter Notebook ecosystem allows once-isolated scientific experiences to be shared in reproducible units through polyglot scientific languages. While the peer review process still requires a human reviewer to measure literary value, computational veracity can be automatically measured when an author adopts software engineering best practices.
Literate programming is a prescient, multi-lingual style of documentation that approaches programs as literary works. Minimally, a literate program is constrained to a document format and a programming language. Respectively, Donald Knuth’s WEB implementation chose TeX and Pascal, while Jupyter notebooks use a pidgin of Markdown, TeX mathematical notation, tables, and syntax highlighting, and accept over 100 different programming languages. In a 2013 essay, Fernando Perez, author of IPython, describes literate computing as weaving interactive computation into scientific narratives, a different concern than the literary and computational qualities of the literate program. A strict separation of these concerns creates inconsistencies, both for the author and reader, between the document and the program.
pidgy bridges literate programming and literate computing in an interactive programming experience that assigns Markdown as the primary programming language. The literate computing read-eval-print-loop allows composing documentation and software concurrently. pidgy modifies the customizable IPython shell and kernel to:

- before execution, tangle code line-for-line from Markdown to valid IPython syntax.
- after execution, weave the input into different representations:
  - the outcome of formal unittest and doctest conventions.
  - the Markdown templated with jinja2 syntax and shown as a rich Markdown display.
Documents are composed interactively, and the introduction of literate programming conventions to the REPL helps ensure literary and computational integrity throughout the document. pidgy is implemented as a literate program such that the pidgy paper and module are derived from the same sources; similarly, the pidgy source code is used in the formal testing. pidgy demonstrates the flexibility of literate code on different continuous integration systems: testing on Github Actions, publishing on Read the Docs, and packaging on PyPI.
pidgy specification

The pidgy literate computing shell
https://ipython.readthedocs.io/en/stable/development/execution.html#execution-semantics
A powerful feature of the jupyter ecosystem is a generalized implementation of the Shell & Kernel model for interactive computing in interfaces like the terminal and notebooks. That is to say that different programming languages can use the same interfaces; jupyter supports over 100 languages now. The general ability to support different languages is possible because of configurable interfaces for the IPython.InteractiveShell and ipykernel.
import ipykernel.kernelapp, ipykernel.zmqshell, nbconvert, traitlets, pidgy, types, pluggy, IPython, jinja2
class pidgyShell(ipykernel.zmqshell.ZMQInteractiveShell):
The pidgy shell is a wrapper around the existing IPython shell experience. It explicitly applies the tangle and weave conventions of literate programming to each interactive computing execution. Once the shell is configured, it can be reused as a jupyter kernel or IPython extension that supports the pidgy [Markdown]/[IPython] metalanguage and metasyntax.
environment = traitlets.Any(nbconvert.exporters.TemplateExporter().environment)
pidgy specification
@pidgy.specification(firstresult=True)
def tangle(str:str)->str:
The tangle step operates on an input string that will become compiled source code. In a literate program, the source is written primarily in the documentation language, and tangling converts it to the programming language. In pidgy, the tangle step targets valid IPython, which is a superset of [Python], and requires further processing.
input_transformers_post = traitlets.List([pidgy.tangle.demojize])
pidgy includes the ability to use emojis as valid python names through the existing traitlets configuration system.
class pidgyManager(IPython.core.inputtransformer2.TransformerManager):
def transform_cell(self, cell):
shell = IPython.get_ipython()
return super(type(self), self).transform_cell(
(shell and hasattr(shell, 'manager') and shell.manager.hook.tangle)(str=cell))
input_transformer_manager = traitlets.Instance(pidgyManager, args=tuple())
ast_transformers = traitlets.List([pidgy.tangle.ExtraSyntax(), pidgy.testing.Definitions()])
Another feature of IPython is the ability to intercept [Abstract Syntax Tree]s and change their representation or capture metadata. After these transformations are applied, IPython compiles the tree into a valid types.CodeType.
@pidgy.specification
def post_execute(self):
...
@pidgy.specification
def post_run_cell(self, result):
The weave step happens after execution; the tangle step happens before. Weaving only occurs if the input is computationally verified. It allows different representations of the input to be displayed. pidgy implements templated Markdown displays of the input and formally tests its contents.
def _post_run_cell(self, result):
self.manager.hook.post_run_cell(result=result)
def _post_exec(self):
self.manager.hook.post_execute()
enable_html_pager = traitlets.Bool(True)
definitions = traitlets.List()
manager = traitlets.Instance('pluggy.PluginManager', args=('pidgy',))
loaders = traitlets.Dict()
weave = traitlets.Any()
@traitlets.default('weave')
def _default_weave(self): return pidgy.weave.Weave(self)
pidgy mixes the standard IPython configuration system and its own pluggy specification and implementation.

Initializing the pidgy shell
def init_pidgy(self):
Initialize pidgy specific behaviors.
self.manager.add_hookspecs(pidgyShell)
for object in (
pidgy.tangle, self.weave, pidgy.testing
):
The tangle and weave implementations are discussed in other parts of this document. Here we register each of them as pluggy hook implementations.
self.manager.register(object)
self.events.register("post_run_cell", types.MethodType(pidgyShell._post_run_cell, self))
self.events.register("post_execute", types.MethodType(pidgyShell._post_exec, self))
if pidgy.pidgyLoader not in self.loaders:
pidgy enters a loader context allowing [Markdown] and notebook files to be used permissively as input.
self.loaders[pidgy.pidgyLoader] = pidgy.pidgyLoader().__enter__()
It also adds a few extra features to the shell.
self.user_ns["shell"] = self
self.user_ns.update({k: v for k, v in vars(IPython.display).items()
if pidgy.util.istype(v, IPython.core.display.DisplayObject)
})
and allows json syntax as valid python input.
pidgy.tangle.init_json()
pidgy.magic.load_ipython_extension(self)
def __init__(self, *args, **kwargs):
Override the initialization of the conventional IPython kernel to include the pidgy opinions.
super().__init__(*args, **kwargs)
self.init_pidgy()
pidgy extension.
def load_ipython_extension(shell):
The pidgy kernel makes it easy to access the pidgy shell, but it can also be used as an IPython extension.
shell.add_traits(manager=pidgyShell.manager, loaders=pidgyShell.loaders, definitions=pidgyShell.definitions, weave=pidgyShell.weave)
shell._post_run_cell = types.MethodType(pidgyShell._post_run_cell, shell)
shell._post_exec = types.MethodType(pidgyShell._post_exec, shell)
pidgyShell.init_pidgy(shell)
shell.input_transformer_manager = pidgyShell.input_transformer_manager.default_value
def unload_ipython_extension(self):
self.events.unregister("post_run_cell", self._post_run_cell)
self.events.unregister("post_run_cell", pidgy.weave.post_run_cell)
loader = self.loaders.pop(pidgy.pidgyLoader)
if loader is not None:
loader.__exit__(None, None, None)
load_ipython_extension = pidgyShell.load_ipython_extension
unload_ipython_extension = pidgyShell.unload_ipython_extension
pidgy implementation

Translating Markdown to Python

A primary translation in literate programming is the tangle step, which converts the literate program into the programming language. The 1979 implementation converts ".WEB" files to valid pascal (".PAS") files. The pidgy approach begins with Markdown files, with proper Python files as the outcome. The rest of this document configures how [IPython] acknowledges the transformation and the heuristics that translate Markdown to Python.
import typing, mistune, IPython, pidgy.util
__all__ = 'tangle', 'Tangle'
The pidgy tangle workflow has three steps:

1. Block-level lexical analysis to tokenize the Markdown.
2. Normalize the tokens to compacted "code" and not-"code" tokens.
3. Translate the normalized tokens to a string of valid Python code.
@pidgy.implementation
def tangle(str:str)->str:
translate = Tangle()
return translate.stringify(translate.parse(''.join(str)))
class pidgyManager(IPython.core.inputtransformer2.TransformerManager):
def transform_cell(self, cell): return super(type(self), self).transform_cell(tangle(str=cell))
Block level lexical analysis.

pidgy uses a modified mistune.BlockLexer to create block level tokens for a [Markdown] source. A specific pidgy addition is a doctest block object; doctests are testable strings that are ignored by the tangle step. The tokens are then normalized and translated to [Python] strings.
class BlockLexer(mistune.BlockLexer, pidgy.util.ContextDepth):
class grammar_class(mistune.BlockGrammar):
doctest = __import__('doctest').DocTestParser._EXAMPLE_RE
block_code = __import__('re').compile(r'^((?!\s+>>>\s) {4}[^\n]+\n*)+')
default_rules = "newline hrule block_code fences heading nptable lheading block_quote list_block def_links def_footnotes table paragraph text".split()
def parse_doctest(self, m): self.tokens.append({'type': 'paragraph', 'text': m.group(0)})
def parse_fences(self, m):
if m.group(2): self.tokens.append({'type': 'paragraph', 'text': m.group(0)})
else: super().parse_fences(m)
def parse_hrule(self, m): self.tokens.append(dict(type='hrule', text=m.group(0)))
def parse_def_links(self, m):
super().parse_def_links(m)
self.tokens.append(dict(type='def_link', text=m.group(0)))
def parse_front_matter(self): ...
def parse(self, text: str, default_rules=None, normalize=True) -> typing.List[dict]:
front_matter = None
if not self.depth:
self.tokens = []
if text.strip() and text.startswith('---\n') and '\n---\n' in text[4:]:
front_matter, sep, text = text[4:].partition('---\n')
front_matter = {'type': 'front_matter', 'text': F"\n{front_matter}"}
with self: tokens = super().parse(pidgy.util.whiten(text), default_rules)
if normalize and not self.depth: tokens = normalizer(text, tokens)
if front_matter: tokens.insert(0, front_matter)
return tokens
Normalizing the tokens

Tokenizing [Markdown] typically extracts conventions at both the block and inline level. Fortunately, pidgy's translation is restricted to block level [Markdown] tokens, mitigating some potential complexities from having opinions about inline code while tangling.
def normalizer(text, tokens):
compacted = []
while tokens:
token = tokens.pop(0)
if 'text' not in token: continue
if not token['text'].strip(): continue
block, body = token['text'].splitlines(), ""
while block:
line = block.pop(0)
if line:
before, line, text = text.partition(line)
body += before + line
if token['type']=='code':
compacted.append({'type': 'code', 'lang': None, 'text': body})
elif compacted and compacted[-1]['type'] == 'paragraph':
compacted[-1]['text'] += body
else: compacted.append({'type': 'paragraph', 'text': body})
if compacted and compacted[-1]['type'] == 'paragraph':
compacted[-1]['text'] += text
elif text.strip():
compacted.append({'type': 'paragraph', 'text': text})
# Deal with front matter
if compacted and compacted[0]['text'].startswith('---\n') and '\n---' in compacted[0]['text'][4:]:
token = compacted.pop(0)
front_matter, sep, paragraph = token['text'][4:].partition('---')
compacted = [{'type': 'front_matter', 'text': F"\n{front_matter}"},
{'type': 'paragraph', 'text': paragraph}] + compacted
return compacted
Flattening the tokens to a [Python] string.

The tokenizer controls the translation of markdown strings to python strings. Our major constraint is that the Markdown input should retain its line numbers.
class Tangle(BlockLexer):
def stringify(self, tokens: typing.List[dict], source: str = """""", last: int =0) -> str:
import textwrap
INDENT = indent = pidgy.util.base_indent(tokens) or 4
for i, token in enumerate(tokens):
object = token['text']
if token and token['type'] == 'code':
if object.lstrip().startswith(pidgy.util.FENCE):
object = ''.join(''.join(object.partition(pidgy.util.FENCE)[::2]).rpartition(pidgy.util.FENCE)[::2])
indent = INDENT + pidgy.util.num_first_indent(object)
object = textwrap.indent(object, INDENT*pidgy.util.SPACE)
if object.lstrip().startswith(pidgy.util.MAGIC): ...
else: indent = pidgy.util.num_last_indent(object)
elif token and token['type'] == 'front_matter':
object = textwrap.indent(
F"locals().update(__import__('ruamel.yaml').yaml.safe_load({pidgy.util.quote(object)}))\n", indent*pidgy.util.SPACE)
elif not object: ...
else:
object = textwrap.indent(object, pidgy.util.SPACE*max(indent-pidgy.util.num_first_indent(object), 0))
for next in tokens[i+1:]:
if next['type'] == 'code':
next = pidgy.util.num_first_indent(next['text'])
break
else: next = indent
Δ = max(next-indent, 0)
if not Δ and source.rstrip().rstrip(pidgy.util.CONTINUATION).endswith(pidgy.util.COLON):
Δ += 4
spaces = pidgy.util.indents(object)
"what if the spaces are ling enough"
object = object[:spaces] + Δ*pidgy.util.SPACE+ object[spaces:]
if not source.rstrip().rstrip(pidgy.util.CONTINUATION).endswith(pidgy.util.QUOTES):
object = pidgy.util.quote(object)
source += object
# add a semicolon to the source if the last block is code.
for token in reversed(tokens):
if token['text'].strip():
if token['type'] != 'code':
source = source.rstrip() + pidgy.util.SEMI
break
return source
Append the lexer for nested rules.
for x in "default_rules footnote_rules list_rules".split():
setattr(BlockLexer, x, list(getattr(BlockLexer, x)))
getattr(BlockLexer, x).insert(getattr(BlockLexer, x).index('block_code'), 'doctest')
if 'block_html' in getattr(BlockLexer, x):
getattr(BlockLexer, x).pop(getattr(BlockLexer, x).index('block_html'))
del x
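As a quick sanity check (mirroring the tangle tests later in this document), the Tangle class defined above turns a Markdown string into Python source: prose is wrapped in a block string while indented code survives as code.

```python
tangle = Tangle()
# the paragraph becomes a quoted string; the indented line stays executable
print(tangle.stringify(tangle.parse("A paragraph.\n\n    print('code')\n")))
```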
More pidgy language features

pidgy experiments with extra language features for python, using the same system that IPython uses to add features like line and cell magics.
import ast, pidgy, IPython
Recently, IPython introduced a convention that allows top-level await statements outside of functions. Building on this convenience, pidgy allows top-level return and yield statements. These statements are replaced with an IPython display statement.
class ExtraSyntax(ast.NodeTransformer):
def visit_FunctionDef(self, node): return node
visit_AsyncFunctionDef = visit_FunctionDef
def visit_Return(self, node):
replace = ast.parse('''__import__('IPython').display.display()''').body[0]
replace.value.args = node.value.elts if isinstance(node.value, ast.Tuple) else [node.value]
return ast.copy_location(replace, node)
def visit_Expr(self, node):
if isinstance(node.value, (ast.Yield, ast.YieldFrom)): return ast.copy_location(self.visit_Return(node.value), node)
return node
visit_Expression = visit_Expr
We know naming is hard; there is no point focusing on it. pidgy allows authors to use emojis as variables in python. They add extra color and expression to the narrative.
def demojize(lines, delimiters=('_', '_')):
str = ''.join(lines)
import tokenize, emoji, stringcase; tokens = []
try:
for token in list(tokenize.tokenize(
__import__('io').BytesIO(str.encode()).readline)):
if token.type == tokenize.ERRORTOKEN:
string = emoji.demojize(token.string, delimiters=delimiters
).replace('-', '_').replace("’", "_")
if tokens and tokens[-1].type == tokenize.NAME: tokens[-1] = tokenize.TokenInfo(tokens[-1].type, tokens[-1].string + string, tokens[-1].start, tokens[-1].end, tokens[-1].line)
else: tokens.append(
tokenize.TokenInfo(
tokenize.NAME, string, token.start, token.end, token.line))
else: tokens.append(token)
return tokenize.untokenize(tokens).decode()
except BaseException: raise SyntaxError(str)
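For example (the same round trip appears in the tangle tests later in this document), emoji identifiers are rewritten into valid snake-case names:

```python
print(demojize("🤖🐼 = 10"))  # prints: _robot_face__panda_face_ = 10
```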
def init_json():
import builtins
builtins.yes = builtins.true = True
builtins.no = builtins.false = False
builtins.null = None
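A minimal sketch of what init_json permits: with the aliases installed as builtins, JSON-style literals read as valid python input.

```python
init_json()  # installs yes/no/true/false/null as builtins
config = {"debug": true, "timeout": null}  # JSON literals now resolve
assert config["debug"] is True and config["timeout"] is None
```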
Import Markdown files and notebooks

Literate pidgy programs are reusable as [Python] scripts and modules. These features are configured by inheriting features from importnb, which customizes the [Python] import system to discover/load alternative source files. pidgy treats [Python], [Markdown], and [Notebook] files as python source.
sys.meta_path and sys.path_hooks
__all__ = 'pidgyLoader',
import pidgy, IPython, importnb
get_data determines how a file is decoded from disk. We use it as an escape hatch for markdown files; otherwise we are importing a notebook.
def get_data(self, path):
if self.path.endswith('.md'): return self.code(self.decode())
return super(pidgyLoader, self).get_data(path)
The code method tangles the [Markdown] to [Python] before compiling to an [Abstract Syntax Tree].
def code(self, str):
for callable in (self.transformer_manager.transform_cell,
pidgy.tangle.demojize):
str = ''.join(callable(''.join(str)))
return str
The visit method allows custom [Abstract Syntax Tree] transformations to be applied.
def visit(self, node):
return pidgy.tangle.ExtraSyntax().visit(node)
Attach these methods to the pidgy loader.

Only [Python] files and common flavored notebooks may be used as source code before the pidgyLoader is defined. Once the pidgyLoader is defined, [Markdown] becomes a new source target for [Python], and [Notebook]s bearing the ".md.ipynb" extension are consumed specially as pidgy flavored documents.
class pidgyLoader(importnb.Notebook):
extensions = ".py.md .md .md.ipynb".split()
transformer_manager = pidgy.tangle.pidgyManager()
code = code
visit = visit
get_source = get_data = get_data
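A short usage sketch, assuming a hypothetical literate document named readme.md sits on the import path:

```python
import pidgy

with pidgy.pidgyLoader():
    import readme  # tangled from readme.md and executed as a regular module
```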
Render and template output source¶
In literate programming, the input is representative of a published form. The original target for the WEB programming
implementation is the Device Independent Format used by Latex, and with the ability to target PDF. [Markdown] is
the pidgy
document language. It is a plain text formatting syntax that has canonical representations in HTML.
An important feature of interactive computing in the browser is access to rich display object provided by
HTML and Javascript. pidgy
takes advantage of the ability to include hypermedia forms that enhance and
support computational narratives.
import dataclasses, IPython, pidgy
@dataclasses.dataclass(unsafe_hash=True)
class Weave:
exporter = __import__('nbconvert').exporters.TemplateExporter()
The Weave class controls the display of pidgy outputs.
shell: object
@pidgy.implementation
def post_run_cell(self, result):
Show the woven output.
text = pidgy.util.strip_front_matter(result.info.raw_cell)
lines = text.splitlines() or ['']
if not lines[0].strip(): return pidgy.util.html_comment(text)
IPython.display.display(IPython.display.Markdown(self.render(text)))
def render(self, text):
import builtins, operator
try:
Try to replace any jinja templates with information in the current namespace and show the rendered view.
template = self.exporter.environment.from_string(text, globals={
**vars(builtins), **vars(operator),
**getattr(self.shell, 'user_ns', {})})
text = template.render()
except BaseException as Exception:
IPython.get_ipython().showtraceback((type(Exception), Exception, Exception.__traceback__))
return text
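A minimal check of the templating step using the Weave class above; with no shell attached, only the builtins and operator namespaces fill the template:

```python
w = Weave(shell=None)
assert w.render("2 + 2 = {{ 2 + 2 }}") == "2 + 2 = 4"
```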
Interactive formal testing

import ast, contextlib, doctest, inspect, re, sys, textwrap, types, typing, unittest, IPython, pidgy

Testing is something we added because of the application of notebooks as test units. A primary use case of notebooks is to test ideas. Typically this is informal, using manual validation to qualify the efficacy of narrative and code. To ensure testable literate documents, we formally test code incrementally during interactive computing.

def make_test_suite(*objects: typing.Union[
    unittest.TestCase, types.FunctionType, str
], vars, name) -> unittest.TestSuite:

The interactive testing suite executes the doctest and unittest conventions, a flexible interface for verifying the computational qualities of literate programs.
suite, doctest_suite = unittest.TestSuite(), doctest.DocTestSuite()
for object in objects:
if isinstance(object, type) and issubclass(object, unittest.TestCase):
suite.addTest(unittest.defaultTestLoader.loadTestsFromTestCase(object))
elif isinstance(object, str):
doctest_suite.addTest(doctest.DocTestCase(
doctest.DocTestParser().get_doctest(object, vars, name, name, 1), doctest.ELLIPSIS))
doctest_suite.addTest(doctest.DocTestCase(
InlineDoctestParser().get_doctest(object, vars, name, name, 1), checker=NullOutputCheck))
elif inspect.isfunction(object):
suite.addTest(unittest.FunctionTestCase(object))
doctest_suite._tests and suite.addTest(doctest_suite)
return suite
@pidgy.implementation
def post_run_cell(result):
shell = IPython.get_ipython()
globs, filename = shell.user_ns, F"In[{shell.last_execution_result.execution_count}]"
if not (result.error_before_exec or result.error_in_exec):
definitions = []
with ipython_compiler(shell):
while shell.definitions:
definition = shell.definitions.pop(0)
object = shell.user_ns.get(definition, None)
if definition.startswith('test_') or pidgy.util.istype(object, unittest.TestCase):
definitions.append(object)
result = run(make_test_suite(result.info.raw_cell, *definitions, vars=shell.user_ns, name=filename), result)
class Definitions(ast.NodeTransformer):
def visit_FunctionDef(self, node):
shell = IPython.get_ipython()
shell and shell.definitions.append(node.name)
return node
visit_ClassDef = visit_FunctionDef
def run(suite: unittest.TestCase, cell) -> unittest.TestResult:
result = unittest.TestResult(); suite.run(result)
if result.failures:
msg = '\n'.join(msg for text, msg in result.failures)
msg = re.sub(re.compile("<ipython-input-[0-9]+-\S+>"), F'In[{cell.execution_count}]', clean_doctest_traceback(msg))
sys.stderr.writelines((str(result) + '\n' + msg).splitlines(True))
return result
@contextlib.contextmanager
def ipython_compiler(shell):
We'll have to replace how doctest compiles code with the IPython machinery.
def compiler(input, filename, symbol, *args, **kwargs):
nonlocal shell
return shell.compile(
ast.Interactive(
body=shell.transform_ast(
shell.compile.ast_parse(shell.transform_cell(textwrap.indent(input, ' '*4)))
).body),
F"In[{shell.last_execution_result.execution_count}]",
"single",
)
yield setattr(doctest, "compile", compiler)
doctest.compile = compile
def clean_doctest_traceback(str, *lines):
str = re.sub(re.compile("""\n\s+File [\s\S]+, line [0-9]+, in runTest\s+raise[\s\S]+\([\s\S]+\)\n?"""), '\n', str)
return re.sub(re.compile("Traceback \(most recent call last\):\n"), '', str)
Utilities for the testing module.
class NullOutputCheck(doctest.OutputChecker):
def check_output(self, *e): return True
class InlineDoctestParser(doctest.DocTestParser):
_EXAMPLE_RE = re.compile(r'`(?P<indent>\s{0})'
r'(?P<source>[^`].*?)'
r'`')
def _parse_example(self, m, name, lineno): return m.group('source'), None, "...", None
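A small sketch of the inline convention these utilities enable: backtick code spans in the narrative are collected as doctests whose output is ignored, so they only verify that the expression runs.

```python
examples = InlineDoctestParser().get_examples("check that `1 + 1` still computes")
assert examples[0].source == "1 + 1\n"
```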
pidgy metasyntax

import pidgy, IPython, jinja2, doctest

`pidgy` not only allows [Markdown] and [Python] to cooperate in a document; metasyntaxes emerge at the interface between the languages.
Markdown is the primary language

`pidgy` considers [Markdown] indented code blocks and language free code fences as valid [Python] code, while every other object is represented as a triple quoted block string.

    print("Indented blocks are always code like in literate coffeescript.")

Indented blocks are always code like in literate coffeescript.
Executing code.

There are two specific ways to ensure that code is executed in `pidgy`.

### Indented code.

Like in the prior cell, an indented code block is a specific token in Markdown that `pidgy` recognizes canonically as code.

    "This is code" # because of the indent.

### Code fences.

```
"I am code because no language is specified."
```

### Ignoring code.

Include a language with the code fence to skip code execution.

```alangage
Add a language specification to a code fence to ignore its input.
```

Or, use html tags.

<pre><code>
I am explicit HTML.
</code></pre>
Testing code

`pidgy` recognizes doctests, a literate programming approach to testing, in the input and executes them in a formal unittest testing suite. doctests are identified by the `">>>"` line prefix.

    >>> assert True
    >>> print
    <built-in function print>
    >>> pidgy
    <module...__init__.py'>
Weaving and templating code

`pidgy` permits the popular `jinja2` templating syntax. Any use of template references like <code>{% raw %}{{}}{% endraw %}</code> will be filled in with information from the current namespace.

There is a variable `foo` with the value <code>{{foo}}</code>.

    foo = 20

Woven, the paragraph above renders as: There is a variable foo with the value 20.
Suppressing the weave output.

`pidgy` will not render any input beginning with a blank line.
Emergent language features

Interleaving Markdown and Python results in natural metasyntaxes that allow `pidgy` authors to write programs that look like documentation.

### Docstrings.

[Markdown] that follows function and class definitions is wrapped as a block string and indented according to `pidgy`'s heuristics. As a result, the [Markdown] becomes the docstring.

    def my_function():

`my_function` demonstrates how docstrings are defined.

    class MyClass:

The same goes for class definitions.

        ...

    >>> my_function.__doc__
    '`my_function` demonstrates how ...'
    >>> MyClass.__doc__
    'The same goes for class definitions.'
Interactive Testing¶
Failures are treated as natural outputs of the documents. Tests may fail, but parts of the unit may be reusable.
def test_functions_start_with_test():
assert False, "False is not True"
assert False is not True
...
pidgy applications

pidgy kernel

A kernel provides programming language support in Jupyter. IPython is the default kernel. Additional kernels include R, Julia, and many more. pidgy is a wrapper kernel around the existing ipykernel and IPython.InteractiveShell.
import IPython, ipykernel.ipkernel, ipykernel.kernelapp, pidgy, traitlets, ipykernel.kernelspec, ipykernel.zmqshell, pathlib
class pidgyKernel(ipykernel.ipkernel.IPythonKernel):
The pidgy kernel specifies to jupyter how it can be used as a native kernel from the launcher or notebook. It specifies which shell class to use.
shell_class = traitlets.Type('pidgy.shell.pidgyShell')
loaders = traitlets.Dict()
_last_parent = traitlets.Dict()
current_cell_id = traitlets.Unicode()
current_cell_ids = traitlets.Set()
def init_metadata(self, *args, **kwargs):
There is some important data captured in the initial metadata that we'll expose for later.
return super().init_metadata(*args, **kwargs)
def do_inspect(self, code, cursor_pos, detail_level=0):
The kernel is where the inspection can be customized. pidgy adds the ability to use the inspector as a Markdown rendering tool.
if code[cursor_pos-3:cursor_pos] == '!!!':
if code[cursor_pos-6:cursor_pos] == '!!!'*2:
self.shell.run_cell(code, silent=True)
return self.markdown_result(self.shell.weave.format_output(code))
result = super().do_inspect(code, cursor_pos, detail_level)
if not result['found']: return self.markdown_result(code)
return result
def markdown_result(self, code):
return dict(found=True, status='ok', metadata={}, data={'text/markdown': code})
def do_complete(self, code, cursor_pos):
The kernel even allows the completion system to be modified.
return super().do_complete(code, cursor_pos)
pidgy kernel installation
def install():
install the pidgy kernel.
import jupyter_client, click
manager = jupyter_client.kernelspec.KernelSpecManager()
path = str((pathlib.Path(__file__).parent / 'kernelspec').absolute())
try:
dest = manager.install_kernel_spec(path, 'pidgy')
except:
click.echo(F"System install was unsuccessful. Attempting to install the pidgy kernel to the user.")
dest = manager.install_kernel_spec(path, 'pidgy', True)
click.echo(F"The pidgy kernel was install in {dest}")
def uninstall():
uninstall the pidgy kernel.
import jupyter_client, click
jupyter_client.kernelspec.KernelSpecManager().remove_kernel_spec('pidgy')
click.echo(F"The pidgy kernel was removed.")
def start(f:str=""):
Launch a pidgy kernel application.
ipykernel.kernelapp.IPKernelApp.launch_instance(connection_file=f, kernel_class=pidgyKernel)
...
Export documents to other formats.

Literate programs can be translated into many formats. Commonly, markup languages focus on translation to other markup languages; we add an extra ability to convert markup to source code.
import pidgy, pathlib, typing, textwrap, IPython
try: from . import util
except: import util
We can reuse existing nbconvert machinery if we expand every file to a notebook.
def file_to_nb(file: pathlib.Path) -> "nbformat.NotebookNode":
import nbformat
if file.suffix in {'.md', '.markdown'}:
return nbformat.v4.new_notebook(cells=[nbformat.v4.new_code_cell(file.read_text())])
return nbformat.reads(file.read_text(), 4)
A notebook can also be flattened.
def flattennb(nb: typing.Union[str, "nbformat.NotebookNode"]):
if isinstance(nb, str): return nb
return [textwrap.indent(''.join(x.source), x.cell_type != 'code' and '# ' or '') for x in nb.cells]
def to_markup(input: typing.Union[str, "nbformat.NotebookNode"], exporter: "nbconvert.Exporter") -> str:
return exporter.from_notebook_node(input)[0]
def to_python(input, tangle = pidgy.tangle.pidgyManager()):
import black, isort
code = pidgy.loader.tangle(flattennb(input))
code = isort.SortImports(file_contents=code).output
code = black.format_str(code, mode=black.FileMode(line_length=100))
return code
def convert(*files, to: {'python', 'markdown'}, write: bool=False):
import nbconvert
exporter = nbconvert.get_exporter(to)()
for file in util.yield_files(files):
nb = file_to_nb(file)
if to =='python':
body = '\n'.join(to_python(''.join(getattr(x, 'source', []))) for x in nb.cells)
else:
body = to_markup(nb, exporter)
if write:
new = pathlib.Path(file).with_suffix(dict(python='.py', markdown='.md')[to])
new.write_text(body)
__import__('click').echo(F"{new} created.")
else:
__import__('click').echo(body)
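A usage sketch of the converter, assuming a hypothetical readme.md in the working directory:

```python
convert("readme.md", to="python", write=True)     # writes readme.py beside the source
convert("readme.md", to="markdown", write=False)  # echoes the woven markdown
```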
https://stackoverflow.com/questions/34643620/how-can-i-split-my-click-commands-each-with-a-set-of-sub-commands-into-multipl
Scripting with literate programs.

Since pidgy is based on [Python], derived pidgy documents can be used as scripts. A pidgy program executed as the main program has similar state to the running notebook, and it introduces the __file__ object. Because [Python] is a scripting language, it should be possible to execute markdown as scripts too.
import types, pidgy, ast, runpy, importlib
__all__ = 'run', 'render', 'Runner'
def run(object: str, **globals) -> dict:
run executes a literate document as a script.
return Runner(object).run(**globals)
def render(object: str, **globals) -> dict:
render executes a templated document.
return Runner(object).render(**globals)
...
class Runner(pidgy.pidgyLoader):
A script Runner for pidgy documents based off the importnb machinery.
def __init__(self, name, path=None, *args, **kwargs):
if path is None: path = name
super().__init__(name, path, *args, **kwargs)
def visit(self, node):
node = super().visit(node)
body, annotations = ast.Module([]), ast.Module([])
while node.body:
element = node.body.pop(0)
if isinstance(element, ast.AnnAssign) and element.target.id[0].islower():
try:
if element.value:
ast.literal_eval(element.value)
annotations.body.append(element)
continue
except: ...
if isinstance(element, (ast.Import, ast.ImportFrom)):
annotations.body.append(element)
body.body.append(element)
self.arg_code = compile(annotations, self.path, 'exec')
return body
def create_module(loader, spec=None):
When the module is created, compile the source to code to discover arguments in the code.
if spec is None:
spec = importlib.util.spec_from_loader(loader.name, loader)
module = super().create_module(spec)
loader.main_code = loader.get_code(loader.name)
runpy._run_code(loader.arg_code, vars(module), {}, '__main__', spec, None, None)
return module
def exec_module(loader, module=None, **globals):
module = module or loader.create_module()
vars(module).update(globals)
runpy._run_code(loader.main_code, vars(module), {}, '__main__', module.__spec__, None, None)
return module
def run(loader, **globals):
return loader.exec_module(**globals)
def render(loader, **globals):
return loader.format(loader.run(**globals))
def cli(loader):
import pidgy.autocli, click
module = loader.create_module()
def main(verbose: bool=True, **globals):
nonlocal module
try:
loader.exec_module(module, **globals)
verbose and click.echo(pidgy.util.ansify(loader.format(module)))
except SystemExit: ...
pidgy.autocli.command_from_decorators(main,
click.option('--verbose/--silent', default=True),
*pidgy.autocli.decorators_from_module(module)).main()
def format(loader, module):
import nbconvert, operator, builtins
if loader.path.endswith(('.py', '.md', '.markdown')):
return nbconvert.TemplateExporter().environment.from_string(
pidgy.util.strip_front_matter(
pidgy.util.strip_html_comment(
pidgy.util.strip_shebang(
loader.decode())))
).render({
**vars(operator), **vars(builtins),
**vars(module)}).rstrip() + '\n'
...
shebang statements in literate programs.

A feature of pidgy markdown files, not notebook files, is that a shebang statement can be included at the beginning to indicate how a document is executed. Some useful shebang lines to begin pidgy documents with:
#!/usr/bin/env pidgy run
#!/usr/bin/env python -m pidgy run
#!/usr/bin/env python -m pidgy render
#!/usr/bin/env pidgy render
Formally testing literate programs

import pidgy, pytest, doctest, importnb.utils.pytest_importnb

Literate documents can be motivated by the need to test a concept. In fact, a common use case of notebooks is that they interactively test units of thought; reusability is often an afterthought.

pidgy documents are meant to be treated as test objects. In fact, the pidgy test suite executed by pytest through Github Actions uses pidgy notebooks (ie. documents with the ".md" or ".md.ipynb" extension). pidgy supplies its own pytest extensions, and it uses nbval and the pytest "--doctest-modules" flag. Under these conditions we discover pytest conventions, unittests, doctests, and optionally validate cell input and output. Ultimately, pidgy documents may represent units of literature that double as formal test objects.

This document is accessed by the "pytest11" console_script and includes the extension with a pytest runner.
class pidgyModule(importnb.utils.pytest_importnb.NotebookModule):
The pidgyModule derives from an existing pytest extension that extracts formal tests from notebooks as if they were regular python files. We'll use the pidgy.pidgyLoader to load Markdown-forward documents as python objects.
loader = pidgy.pidgyLoader
class pidgyTests(importnb.utils.pytest_importnb.NotebookTests):
pidgyTests makes sure to include the alternative source formats that tangle to python for execution.
modules = pidgyModule,
pidgy command line interface
import IPython, pidgy, pathlib, typing, click, functools, contextlib, sys, types
with pidgy.pidgyLoader():
try: from . import kernel, autocli, runpidgy, util, export, weave
except: import kernel, autocli, runpidgy, util, export, weave
def run(ctx, ref: str):
pidgy run executes pidgy documents as programs.
import pidgy, click
click.echo(F"Running {ref}.")
with pidgy.util.sys_path(), pidgy.util.argv(*([ref] + ctx.args)):
runpidgy.run(ref)
def template(ctx, ref: str, no_show:bool=False):
pidgy template executes pidgy documents as programs and publishes the templated results.
import pidgy, click
with pidgy.util.sys_path(), pidgy.util.argv(*([ref] + ctx.args)):
data = pidgy.runpidgy.render(ref)
if not no_show: click.echo(pidgy.util.ansify(data))
def to(to:{'markdown', 'python'}, files: typing.List[pathlib.Path], write:bool=False):
Convert pidgy documents to other formats.
pidgy.export.convert(*files, to=to, write=write)
def test(ctx, files: list):
Formally test markdown documents, notebooks, and python files.
import pytest
pytest.main(ctx.args+['--doctest-modules', '--disable-pytest-warnings']+list(files))
application = autocli.autoclick(
run, test, to, template,
autocli.autoclick(
pidgy.kernel.install, pidgy.kernel.uninstall, pidgy.kernel.start, group=click.Group("kernel")
),
context_settings=dict(allow_extra_args=True, ignore_unknown_options=True),
)
Discussion

Markdown for literate programming

Alternative source files

import pidgy, pandas, types, builtins, pathlib
pidgy is a literate program and it uses a mix of files for its source. Until the tangle step of literate programming, we have to rely on available files with available importers; afterwards, markdown is used as source.

{{file_counts.to_html()}}

file_counts = pandas.Series({k:pathlib.Path(v.__file__).suffix for k,v in vars(pidgy).items() if isinstance(v, types.ModuleType) and v is not builtins}).value_counts().to_frame('extensions').T
Shebangs

Perhaps one of the more damning shortcomings of the notebook is that it is not a script, and requires specialized software to execute. `pidgy` markdown documents can begin with a shebang line that defines how the script should be executed.

```bash
#!/usr/bin/env python -m pidgy run file.md
```

When the shebang is included, the markdown file can be executed at the command line with a preceding `./`.

```bash
./file.md
```
Outcomes

There are numerous outcomes for source written as literate code. With `pidgy` as an example, we can do more than simply tangle and weave a program.
Best practices for literate programming in Markdown

    import json

Our core moral commitment is to write literate programs, because:

> ...; surely nobody wants to admit writing an illiterate program.
>
> > - [Donald Knuth] _[Literate Programming]_

- Restart and run all or it didn't happen. A document should be literate in all readable, reproducible, and reusable contexts.
- [Markdown] documents are sufficient for literate programs. Markdown documents that translate to python can encode literate programs in a form that is better for version control systems than the `json` format that encodes notebooks.
- All code should compute. Testing code in a narrative provides supplemental meaning to the `"code"` signifiers. They provide a test of veracity, at least for the computational literacy.
- [readme.md] is a good default name for a program. Eventually authors will compose [readme.md] documents that act as both the `"__init__"` and `"__main__"` methods of the program.
- Each document should stand alone, [despite all possibilities to fall.](http://ing.univaq.it/continenza/Corso%20di%20Disegno%20dell'Architettura%202/TESTI%20D'AUTORE/Paul-klee-Pedagogical-Sketchbook.pdf#page=6)
- Use code, data, and visualization to fill the voids of natural language.
- Find pleasure in writing.
- When in doubt, abide the [Web Content Accessibility Guidelines][wcag] so that information can be accessed by differently abled audiences.
- The first markdown cell is the docstring.

[wcag]: https://www.w3.org/WAI/standards-guidelines/wcag/
[donald knuth]: #
[literate programming]: #
[markdown]: #
[readme.md]: #
External tools

`pidgy` works in jupyter notebook, jupyterlab, nteract, and colab.
Conclusion

Markdown documents represent a compact form of literate programming.
pidgy tests

Test pidgy.tangle
import pidgy, ast
dir(pidgy.tangle)
['BlockLexer',
'ExtraSyntax',
'IPython',
'Tangle',
'__all__',
'__builtins__',
'__doc__',
'__file__',
'__loader__',
'__name__',
'__package__',
'__spec__',
'__test__',
'ast',
'demojize',
'init_json',
'mistune',
'normalizer',
'pidgy',
'pidgyManager',
'tangle',
'typing']
tangle = pidgy.tangle.Tangle()
s = """---
a: front matter
---
This is a paragraph.
* a list
def f():
A docstring
print
"""
Unnormalized tokens.
tangle.parse(s, normalize=False)
[{'type': 'front_matter', 'text': '\na: front matter\n'},
{'type': 'paragraph', 'text': 'This is a paragraph.'},
{'type': 'list_start', 'ordered': False},
{'type': 'loose_item_start'},
{'type': 'text', 'text': 'a list'},
{'type': 'newline'},
{'type': 'text', 'text': ' def f():'},
{'type': 'list_item_end'},
{'type': 'list_end'},
{'type': 'paragraph', 'text': 'A docstring'},
{'type': 'code', 'lang': None, 'text': ' print'}]
Normalized block tokens
tangle.parse(s, normalize=True)
[{'type': 'front_matter', 'text': '\na: front matter\n'},
{'type': 'paragraph',
'text': '\nThis is a paragraph.\n\n* a list\n\n def f():\n \nA docstring'},
{'type': 'code', 'lang': None, 'text': '\n \n print'}]
The tangled [Python] source.
print(tangle.stringify(tangle.parse(s)))
locals().update(__import__('ruamel.yaml').yaml.safe_load(
"""a: front matter"""
))
"""This is a paragraph.
* a list
def f():
A docstring"""
print
transform = pidgy.tangle.pidgyManager().transform_cell
print(transform(s))
locals().update(__import__('ruamel.yaml').yaml.safe_load(
"""a: front matter"""
))
"""This is a paragraph.
* a list
def f():
A docstring"""
print
print(pidgy.tangle.demojize("""
🤖🐼 = 10
"""))
_robot_face__panda_face_ = 10
ast.parse(transform("""
return 100
""")).body
[<_ast.Return at 0x10b3afeb8>]
pidgy.tangle.ExtraSyntax().visit(ast.parse(transform("""
return 100
"""))).body[0].value
<_ast.Call at 0x109043080>
Test pidgy.runpidgy

import pidgy
>>> dir(pidgy.runpidgy)
['Runner', '__all__', '__builtins__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '__test__', 'ast', 'importlib', 'pidgy', 'render', 'run', 'runpy', 'types']
foo = None
The `"templated_document.md"` file has `foo` annotated meaning that it will be recognized as CLI parameter.
"templated_document.md"
file has foo
annotated meaning that it will be recognized as CLI parameter.[5]:
>>> pidgy.runpidgy.run('templated_document.md')
__main__
<module 'templated_document.md' from 'templated_document.md'>
>>> pidgy.runpidgy.render('templated_document.md')
__main__
"\n...foo... is defined as 42\n\nMy document recieved ['...__main__.py', 'kernel', 'start', '-f', '...'] as arguments.\n"
>>> pidgy.runpidgy.render('templated_document.md', foo=900)
__main__
"\n...foo... is defined as 900\n\nMy document recieved ['...__main__.py', 'kernel', 'start', '-f', '...'] as arguments.\n"
runner = pidgy.runpidgy.Runner('templated_document.md')
>>> runner.render(foo=900)
__main__
"\n...foo... is defined as 900\n\nMy document recieved ['...__main__.py', 'kernel', 'start', '-f', '...'] as arguments.\n"
>>> runner.run()
__main__
<module 'templated_document.md' from 'templated_document.md'>
>>> with pidgy.util.argv('script --foo 900'):
... try: runner.cli()
... except SystemExit: None
__main__
...foo... is defined as 900
<BLANKLINE>
My document recieved ['script', '--foo', '900'] as arguments.
<BLANKLINE>
Test pidgy.magic
pidgy automatically provides the magics when it is imported interactively.
import pidgy
from IPython import get_ipython
%%tangle
This is my pidgy
print("This is my code")
%%tangle --tokens
This is my pidgy
print("This is my code")
foo = 900
%%render
I am a Markdown template that can display {{foo}}
import pidgy
from IPython import get_ipython
ip = get_ipython()
if ip:
!pidgy
Usage: pidgy [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
kernel
run `pidgy` `run` executes `pidgy` documents as programs.
template `pidgy` `template` executes `pidgy` documents as programs and...
test Formally test markdown documents, notebooks, and python files.
to Convert pidgy documents to other formats.
if ip:
!pidgy run --help
Usage: pidgy run [OPTIONS] REF
`pidgy` `run` executes `pidgy` documents as programs.
Options:
--help Show this message and exit.
if ip:
!pidgy test --help
Usage: pidgy test [OPTIONS] [FILES]...
Formally test markdown documents, notebooks, and python files.
Options:
--help Show this message and exit.
if ip:
!pidgy kernel --help
Usage: pidgy kernel [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
install `install` the pidgy kernel.
start Launch a `pidgy` kernel applications.
uninstall `uninstall` the kernel.
if ip:
!pidgy run templated_document.md
Running templated_document.md.
__main__
if ip:
!pidgy template templated_document.md
__main__
`foo` is defined as 42
My document recieved ['templated_document.md'] as arguments.
# NBVAL_SKIP
if ip:
!chmod u+x templated_document.md
!./templated_document.md
__main__
`foo` is defined as 42
My document recieved ['./templated_document.md'] as arguments.
Test 3rd party integrations.
import papermill, pathlib, tqdm
def remove_output(path):
try: __import__('os').remove(path)
except: ...
def test_execute():
in_, out = pathlib.Path('pidgy/tests/parameterized_notebook.md.ipynb'), pathlib.Path('pidgy/tests/_parameterized_notebook.md.ipynb')
assert in_.exists()
papermill.execute_notebook(str(in_), str(out), kernel_name='pidgy', parameters=dict(alpha=100))
assert out.exists()
remove_output(out)
Create a simple fastapi application

fastapi is a web application framework in [Python]. It uses type annotations to define endpoints.
import fastapi, click
Make an instance of a fastapi application.
app = fastapi.FastAPI()
@app.get('/highlight/{str}')
def highlight_terminal(str:str):
import pygments, pygments.lexers, pygments.formatters.terminal256
return pygments.highlight(str, pygments.lexers.find_lexer_class_by_name('yaml')(), pygments.formatters.terminal256.Terminal256Formatter(style='bw'))
@app.get('/upper/{str}')
def upper(str:str):
upper returns the uppercase value of the input string.
return str.upper()
@click.group()
def cli(): ...

@cli.command()
def schema():
Display the schema for our simple application.
click.echo(highlight_terminal(__import__('yaml').safe_dump(app.openapi(), default_flow_style=False)))

@cli.command()
def serve():
Serve the simple fastapi application.
__import__('uvicorn').run(app, host="0.0.0.0", port=8000)
__name__ == "__main__" and cli()
def _test_app():
__import__('nest_asyncio').apply()
import starlette.testclient
client = starlette.testclient.TestClient(app)
assert client.get('/upper/rawr').json() == 'RAWR'
Tidy Data
> [...], a stack of elements is a common abstract data type used in computing. We would not think ‘to add’ two stacks as we would two integers.
>> Jeanette Wing - [Computational thinking and thinking about computing][computational thinking]
[computational thinking]: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2696102/
A modernist style of notebook programming persists where documents are written as if programs are starting from nothing. Meanwhile, authors in the R programming language tend to begin with the assumption that data exists and so does code. Notebooks are a powerful substrate for working with data and describing the logic behind different permutations.

pidgy was designed to weave projections of tabular data into computational documentation. Specifically, we are concerned with the DataFrame, a popular tidy data abstraction that serves as a first class data structure in scientific computing.
import pandas as 🐼
<hr/>
The figure above illustrates the information in `df`. A high level numeric projection of this data's statistics is:

{{df.describe().to_html()}}

The statistics were created using measurements that look like the following data:

{{df.head(2).to_html()}}

df = 🐼.DataFrame([range(i, i+4) for i in range(10)], columns=list('abcd'))
df.plot();

The figure above illustrates the information in df
.
A high level numeric project of this data’s statistics are:
The statistics were created using measurements that look like the following data:
df = 🐼.DataFrame([range(i, i+4) for i in range(10)], columns=list('abcd'))
df.plot();
[7]:
<hr/>
In technical writing we need to consider existing conventions like:

* Figures above captions
* Tables below captions

It still remains to be seen where code canonically fits in reference to figures and tables. [Why should a table caption be placed above the table?]

[Why should a table caption be placed above the table?]: https://tex.stackexchange.com/questions/3243/why-should-a-table-caption-be-placed-above-the-table
[notebook war]
[notebook war]: https://yihui.org/en/2018/09/notebook-war/
Figures
from graphviz import Source
Ø = __name__ == '__main__'
<!--
tangle_weave = Source(
digraph {rankdir=LR
subgraph cluster_pidgy {
label="PIDGY literate programming"
MD->MARKDOWN[label=WEAVE]
MARKDOWN->HTML[label=PANDOC style=dashed]
HTML->PDF[label=PRINT style=dashed]
MD->IPY[label=TANGLE]
IPY->PYC[label=PYTHON]
IPYNB->IPY[label=TANGLE style=dotted]
IPYNB->MARKDOWN[label=WEAVE]
}
subgraph cluster_knuth {
label="WEB literate programming"
WEB->TEX[label=WEAVE] TEX->DVI[label=TEX]
WEB->PAS[label=TANGLE] PAS->REL[label=PASCAL]
}
}
, filename='tange_weave.png')
tangle_weave.save()
if Ø: return tangle_weave
-->
tangle_weave: The diagram of tangling and weaving compares the original literate programming approach to the markdown-forward approach. The weaving step becomes an identity transformation. Tangling markdown to python is a line-by-line transformation, chosen to maintain proper assertions, that requires wrapping `not "code"` in quotations and indenting it properly. IPY is a superset of python provided by IPython and includes shell, magics, and macro commands.
API Reference

This page contains auto-generated API reference documentation.

pidgy.autocli

- pidgy.autocli.autoclick(*object: typing.Union[types.FunctionType, click.Command], group=None, **settings) → click.Command: Automatically generate a click command line application using type inference.

pidgy.util

- class pidgy.util.ContextDepth: Count the current depth of a context manager invocation.
  - depth = 0
- pidgy.util.WHITESPACE
- pidgy.util.num_first_indent(text: str) → int: The number of indents for the first blank line.
- pidgy.util.base_indent(tokens: typing.List[dict]) → int: Peek into mistune tokens and find the last code indent.
- pidgy.util.whiten(text: str) → str: whiten strips empty lines because the markdown.BlockLexer doesn't like that.
- pidgy.util.strip_front_matter(text: str, sep=None) → str: Remove yaml front matter from a string.
- pidgy.util.yield_files(files: typing.Sequence[str], recursive=False) → typing.Generator: Return a list of files from a collection of files and globs.