class documentation

Export a Report as an MS Word docx file.

The docx (known as python-docx) package is used for that.

Method __init__ Ctor.
Method add_paragraph_style Add a user-defined paragraph style to the document.
Method available_styles Return a dict of available styles.
Method control_to Convert _ControlType elements into MS Word document components.
Method dataframe_to_table Convert a pandas.DataFrame into an MS Word table.
Method image Embed an image into the document.
Method page_numbering Set the page numbering behavior.
Method save Start the export of each element into the docx file.
Method set_page_orientation_landscape Page/section in landscape orientation.
Method set_page_orientation_portrait Page/section in portrait orientation.
Method string_to_heading Add heading.
Method string_to_paragraph Add paragraph.
Method strings_to_enumeration Add a paragraph as enumeration item.
Method toggle_page_orientation Toggle the current page/section's orientation.
Instance Variable file_path Path to the file.
Static Method _add_column_labels_one_level Use in _add_row_and_column_labels().
Static Method _add_column_labels_two_levels Use in _add_row_and_column_labels().
Static Method _add_field Add a _field_ to an existing _run_.
Static Method _add_page_number Add a page number field to an existing paragraph.
Static Method _add_row_and_column_labels Format the labels of rows and columns.
Static Method _add_row_labels_one_level Use in _add_row_and_column_labels().
Static Method _add_row_labels_two_levels Use in _add_row_and_column_labels().
Static Method _cell_value_to_formated_string Convert a pandas.DataFrame cell value into a formatted string for an MS Word table.
Static Method _color_to_rgbcolor Convert "color" value into docx (python-docx) format.
Static Method _element_with_attribute Create an XML element with attribute and value.
Static Method _multiindex_to_dict Convert a MultiIndex to dictionary.
Static Method _table_autofit Autofit objects in table.
Method _activate_auto_update_fields All fields will be updated semi-automatically.
Method _caption Add caption paragraph including name and numbering.
Method _init_docx_document_instance Initiate and setup the document instance self._doc.
Method _init_header_and_footer Set up the header and footer elements of the document.
Method _init_styles Create some additional styles.
Method _set_margins See __init__() for details.
Method _set_page_orientation Set page orientation if different.
Method _setup_language Set the document's language.
Method _setup_spell_and_grammar_checking Undocumented
Method _table_of_something Generic method to create a reference list (e.g. TOC).
Instance Variable _additional_styles List of user-defined styles added via add_paragraph_style().
Instance Variable _autoupdate_fields Activate the auto update field feature.
Instance Variable _captions_position_figure Position of figure captions on top (0) or at the bottom (1) of a figure. Bottom is the default.
Instance Variable _captions_position_table Position of table captions on top (0) or at the bottom (1) of a table. Bottom is the default.
Instance Variable _doc Instance of docx.document.Document.
Instance Variable _footer_text String for the document footer.
Instance Variable _header_text String for the document header.
Instance Variable _language The language relevant for spell checking.
Instance Variable _margins A 4 item tuple with millimeter values clockwise starting with top.
Instance Variable _page_numbering A dict with settings about page numbering.
Instance Variable _spell_and_grammar_checking Document global checking of spelling and grammar.

Inherited from _GenericDocumentOutput:

Method show Open the document with the associated application.
Instance Variable tags_allowed List of tags for elements that are allowed.
Instance Variable tags_excluded List of tags for elements that are excluded.
Method _not_implemented Handle elements that are not implemented and emit a warning.
def __init__(self, file_path: pathlib.Path, *, margins: str | tuple = 'narrow', autoupdate_fields: bool = True, tags_allowed: list[str] | str = None, tags_excluded: list[str] | str = None, text_header: str = None, text_footer: str = None, spell_and_grammar_checking: bool = False, language: str = None, captions_position_table: int = 1, captions_position_figure: int = 1): (source)

Ctor.

Parameters
file_path (pathlib.Path): Path to the docx file to generate.
margins (str | tuple): Name of a predefined margin set (narrow or moderate), a 4 item tuple with millimeter values clockwise starting with top, or a 2 item tuple with millimeter values for top-bottom and left-right.
autoupdate_fields (bool): Activate automatic updating of all fields in the docx.
tags_allowed (list[str] | str): See _GenericDocumentOutput.
tags_excluded (list[str] | str): See _GenericDocumentOutput.
text_header (str): Text to put in the document header.
text_footer (str): Text to put in the document footer.
spell_and_grammar_checking (bool): Activate checking of spelling and grammar.
language (str): Specify the language used.
captions_position_table (int): Captions before (0) or after (1, default) a table.
captions_position_figure (int): Captions before (0) or after (1, default) a figure.
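
The three accepted forms of margins can be normalized into the 4 item tuple that _margins holds. The following is only a minimal sketch: the function name normalize_margins and the preset values in PRESETS are assumptions for illustration (the values mirror the usual MS Word "Narrow" and "Moderate" presets and are not taken from this class).

```python
from __future__ import annotations

# Assumed presets in millimeters, clockwise starting with top
# (top, right, bottom, left). The values actually used by the class
# are not stated in this documentation.
PRESETS = {
    'narrow': (12.7, 12.7, 12.7, 12.7),
    'moderate': (25.4, 19.05, 25.4, 19.05),
}


def normalize_margins(margins: str | tuple) -> tuple:
    """Return a 4 item tuple (top, right, bottom, left) in millimeters."""
    if isinstance(margins, str):
        if margins not in PRESETS:
            raise ValueError(f'Unknown margin preset: {margins!r}')
        return PRESETS[margins]

    if len(margins) == 2:
        # 2 item form: (top-bottom, left-right)
        top_bottom, left_right = margins
        return (top_bottom, left_right, top_bottom, left_right)

    if len(margins) == 4:
        return tuple(margins)

    raise ValueError('Expected a preset name or a tuple with 2 or 4 items.')
```

With this sketch, normalize_margins((10, 20)) yields (10, 20, 10, 20).
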
def add_paragraph_style(self, name: str, *, based_on: str = None, font_name: str = None, font_pt: int = None, font_color: tuple[int, int, int] | str = None, font_bgcolor: tuple[int, int, int] | str = None): (source)

Add a user-defined paragraph style to the document.

A style can then be used via its name when adding new paragraphs:

doc = DocxDocument(fp)
report = Report()

# Create the style
doc.add_paragraph_style(name='MyStyle', font_name='Fira Code')

# Use the style
report.paragraph('lorem ipsum', style='MyStyle')
Parameters
name (str): Name of the new style.
based_on (str): Parent style that the new style should inherit from.
font_name (str): Font name.
font_pt (int): Font size in points.
font_color (tuple[int, int, int] | str): Foreground color as name, hex value or RGB value tuple.
font_bgcolor (tuple[int, int, int] | str): Background color of the font.
def available_styles(self) -> dict[Hashable, list]: (source)

Return a dict of available styles.

The dict keys are the categories of styles. The values are the style names as list.

Note

Seems to be unused. It also makes little sense because _doc is None except while save() is running.

def control_to(self, control_type: _ControlType): (source)

Convert _ControlType elements into MS Word document components.

Such components are page breaks, page orientation, reference lists for headings, figures and tables.

def dataframe_to_table(self, df: pandas.DataFrame, *, caption: str, note: str, autofit: bool, decimal_places: int = 2): (source)

Convert a pandas.DataFrame into an MS Word table.

Long tables can cause performance problems. A warning is logged on long tables.

Parameters
df (pandas.DataFrame): The data frame.
caption (str): String for the table caption.
note (str): Used as (foot)note.
autofit (bool): Auto adjust the width of the table to its content.
decimal_places (int): Number of decimal places that float values are rounded to.
def image(self, image_bytes: bytes, scale_factor: float = None, caption: str = None): (source)

Embed an image into the document.

The original picture source is not scaled, resized or converted. Scaling is done dynamically by MS Word, which is given the view size in millimeters. That value is calculated from the original pixel size, the DPI and the scaling factor scale_factor.

Plots and figures from matplotlib or derivatives are saved in PNG format with DEFAULT_DPI dpi.

Parameters
image_bytes (bytes): The image as bytes.
scale_factor (float): Factor to scale.
caption (str): Image caption string.
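
The view size described above follows from converting pixels to inches via the DPI, then to millimeters, and applying the scale factor. This is a rough sketch only: the function name view_size_mm is hypothetical, and ASSUMED_DPI is a placeholder because the actual value of DEFAULT_DPI is not given here.

```python
MM_PER_INCH = 25.4

# Placeholder only; the real DEFAULT_DPI of the package is not
# documented in this section.
ASSUMED_DPI = 300


def view_size_mm(width_px, height_px, dpi=ASSUMED_DPI, scale_factor=None):
    """Return (width, height) in millimeters for display in the document.

    The pixel data itself stays untouched; only the displayed size changes.
    """
    factor = 1.0 if scale_factor is None else scale_factor
    width_mm = width_px / dpi * MM_PER_INCH * factor
    height_mm = height_px / dpi * MM_PER_INCH * factor
    return (width_mm, height_mm)
```

Under these assumptions an image of 300 x 600 pixels at 300 dpi is displayed at 25.4 mm x 50.8 mm, and at half that size with scale_factor=0.5.
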
def page_numbering(self, pos: int | str, *, prefix: str = '\tPage ', with_pagenum: bool = True, infix: str = ' of ', suffix: str = None): (source)

Set the page numbering behavior.

Parameters
pos (int | str): Indicates header (0/header) or footer (1/footer).
prefix (str): String in front of the page number.
with_pagenum (bool): Add count of all pages.
infix (str): String between current page number and all pages count.
suffix (str): String at the end.
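
How the string parameters combine can be illustrated with plain text. This sketch reads with_pagenum literally from its description above (it controls whether the total page count is appended); the function name page_number_text is hypothetical. In the real document the two numbers are MS Word fields (PAGE and NUMPAGES), not fixed text.

```python
def page_number_text(current, total, *, prefix='\tPage ',
                     with_pagenum=True, infix=' of ', suffix=None):
    """Compose the visible page numbering text from its parts."""
    parts = [prefix or '', str(current)]

    # Assumption: with_pagenum toggles the "of <total>" part.
    if with_pagenum:
        parts.append(infix or '')
        parts.append(str(total))

    if suffix:
        parts.append(suffix)

    return ''.join(parts)
```

With the defaults, page 3 of 10 renders as a tab followed by "Page 3 of 10".
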
def save(self, report: Report) -> _GenericDocumentOutput: (source)

Start the export of each element into the docx file.

Parameters
report (Report): The report instance with all elements.
Returns
_GenericDocumentOutput: Itself.
def set_page_orientation_landscape(self): (source)

Page/section in landscape orientation.

def set_page_orientation_portrait(self): (source)

Page/section in portrait orientation.

def string_to_heading(self, text: str, level: int): (source)

Add heading.

def string_to_paragraph(self, text: str, style: str): (source)

Add paragraph.

def strings_to_enumeration(self, items: tuple[str], ordered: bool, style: str): (source)

Add a paragraph as enumeration item.

The argument ordered is ignored if argument style is used.

Parameters
items (tuple[str]): List of strings as enumeration items.
ordered (bool): If True style 'List Number' is used, otherwise 'List Bullet'.
style (str): Explicit name of the style.
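
The precedence between style and ordered described above can be written down in a few lines. The helper name resolve_list_style is hypothetical and only illustrates the rule.

```python
from typing import Optional


def resolve_list_style(ordered: bool, style: Optional[str] = None) -> str:
    """Pick the paragraph style for enumeration items."""
    # An explicit style wins; 'ordered' is ignored in that case.
    if style:
        return style

    return 'List Number' if ordered else 'List Bullet'
```
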
def toggle_page_orientation(self, section: docx.section.Section = None): (source)

Toggle the current page/section's orientation.
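
Toggling an orientation usually means flipping the orientation flag and swapping page width and height. The sketch below shows that idea with a hypothetical PageSection dataclass standing in for docx.section.Section; it is not the method's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class PageSection:
    """Hypothetical stand-in for docx.section.Section."""
    orientation: str  # 'portrait' or 'landscape'
    page_width: int   # millimeters in this sketch
    page_height: int


def toggle_orientation(section: PageSection) -> PageSection:
    """Flip the orientation and swap width and height accordingly."""
    if section.orientation == 'landscape':
        section.orientation = 'portrait'
    else:
        section.orientation = 'landscape'

    section.page_width, section.page_height = (
        section.page_height, section.page_width)

    return section
```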

@staticmethod
def _add_column_labels_one_level(df, tab): (source)
@staticmethod
def _add_column_labels_two_levels(df, tab): (source)
@staticmethod
def _add_field(run, field): (source)

Add a _field_ to an existing _run_.

A _run_ is a part of a paragraph. Fields are visible in MS Word as elements with gray background (e.g. for page numbers or citation references).

Credits: https://github.com/python-openxml/python-docx/issues/498#issuecomment-394143566

Parameters
run: The run.
field: The field.
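
In WordprocessingML a field inside a run is commonly expressed as a begin marker, the field instruction and an end marker. The following sketch builds that structure with the standard library's xml.etree.ElementTree instead of python-docx, so the helper names (w, add_field) and the use of plain ElementTree elements are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
XML_NS = 'http://www.w3.org/XML/1998/namespace'
ET.register_namespace('w', W_NS)


def w(tag):
    """Return the namespace-qualified name for a 'w:' tag or attribute."""
    return f'{{{W_NS}}}{tag}'


def add_field(run, field):
    """Append begin marker, instruction text and end marker to a run."""
    begin = ET.SubElement(run, w('fldChar'))
    begin.set(w('fldCharType'), 'begin')

    instruction = ET.SubElement(run, w('instrText'))
    instruction.set(f'{{{XML_NS}}}space', 'preserve')
    instruction.text = field

    end = ET.SubElement(run, w('fldChar'))
    end.set(w('fldCharType'), 'end')

    return run


# Usage: a run carrying a page number field
run = ET.Element(w('r'))
add_field(run, 'PAGE')
```
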
@staticmethod
def _add_page_number(paragraph: docx.text.paragraph.Paragraph, with_pagenum: bool, prefix: str, infix: str, suffix: str): (source)

Add a page number field to an existing paragraph.

Credits: https://stackoverflow.com/a/62534711/4865723

@staticmethod
def _add_row_and_column_labels(df, tab): (source)

Format the labels of rows and columns.

@staticmethod
def _add_row_labels_one_level(df, tab): (source)
@staticmethod
def _add_row_labels_two_levels(df, tab): (source)
@staticmethod
def _cell_value_to_formated_string(cell_value, decimal_places) -> str: (source)

Convert a pandas.DataFrame cell value into a formatted string for an MS Word table.

The original cell_value is converted into a string. The decimal point and thousands delimiter are chosen based on the current locale.LC_NUMERIC. A float is rounded to decimal_places using round(). Items of an iterable are treated the same way.
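
A locale-aware conversion along these lines could look as follows. This is a sketch under assumptions, not the method itself: the function name cell_value_to_string, the fixed number of printed decimals and the ', ' separator for iterables are choices made for illustration.

```python
import locale


def cell_value_to_string(cell_value, decimal_places=2):
    """Format a cell value using the current LC_NUMERIC settings."""
    # Strings are iterable too, so handle them first.
    if isinstance(cell_value, str):
        return cell_value

    # bool is a subclass of int; keep 'True'/'False' readable.
    if isinstance(cell_value, bool):
        return str(cell_value)

    if isinstance(cell_value, float):
        rounded = round(cell_value, decimal_places)
        return locale.format_string(
            f'%.{decimal_places}f', rounded, grouping=True)

    if isinstance(cell_value, int):
        return locale.format_string('%d', cell_value, grouping=True)

    try:
        items = iter(cell_value)
    except TypeError:
        return str(cell_value)

    # Assumption: items are formatted one by one and joined with ', '.
    return ', '.join(
        cell_value_to_string(item, decimal_places) for item in items)
```

The separators in the result depend on the locale that is active when the function is called, for example after locale.setlocale(locale.LC_NUMERIC, '').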

@staticmethod
def _color_to_rgbcolor(color: str | tuple[int, int, int]) -> docx.shared.RGBColor: (source)

Convert "color" value into docx (python-docx) format.

The package webcolors is used to do the conversion.

Parameters
color (str | tuple[int, int, int]): Name of a color, hex color code as string or an integer tuple with RGB values for red, green and blue.
Returns
docx.shared.RGBColor: Undocumented
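
The three accepted input forms can be illustrated without third-party packages. In this sketch the small NAMED_COLORS table is only a stand-in for the name lookup that webcolors provides, the result is a plain tuple instead of docx.shared.RGBColor, and the function name color_to_rgb is hypothetical.

```python
# Tiny stand-in for the color names known to the webcolors package.
NAMED_COLORS = {
    'red': (255, 0, 0),
    'green': (0, 128, 0),
    'blue': (0, 0, 255),
}


def color_to_rgb(color):
    """Return an (r, g, b) tuple for a name, hex code or RGB tuple."""
    if isinstance(color, str):
        if color.startswith('#'):
            digits = color[1:]
            if len(digits) == 3:
                # Short form, e.g. '#fff' means '#ffffff'
                digits = ''.join(char * 2 for char in digits)
            if len(digits) != 6:
                raise ValueError(f'Invalid hex color: {color!r}')
            rgb = tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))
        else:
            try:
                rgb = NAMED_COLORS[color.lower()]
            except KeyError:
                raise ValueError(f'Unknown color name: {color!r}') from None
    else:
        rgb = tuple(color)

    if len(rgb) != 3 or not all(0 <= value <= 255 for value in rgb):
        raise ValueError(f'Invalid RGB value: {color!r}')

    return rgb
```
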
@staticmethod
def _element_with_attribute(element_name: str, attribute_name: str, attribute_value: str) -> docx.oxml.OxmlElement: (source)

Create an XML element with attribute and value.
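
As an illustration, such a helper can be sketched with the standard library. The real method returns a docx.oxml.OxmlElement; here xml.etree.ElementTree is used as a stand-in, and accepting prefixed names such as 'w:val' is an assumption. The helper names clark_name and element_with_attribute are hypothetical.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    'w': 'http://schemas.openxmlformats.org/wordprocessingml/2006/main',
}
ET.register_namespace('w', NAMESPACES['w'])


def clark_name(prefixed_name):
    """Turn 'w:val' into '{namespace-uri}val'."""
    prefix, local = prefixed_name.split(':')
    return f'{{{NAMESPACES[prefix]}}}{local}'


def element_with_attribute(element_name, attribute_name, attribute_value):
    """Create an element carrying exactly one attribute."""
    element = ET.Element(clark_name(element_name))
    element.set(clark_name(attribute_name), attribute_value)
    return element


# Usage: the settings entry that asks MS Word to update fields on opening
update_fields = element_with_attribute('w:updateFields', 'w:val', 'true')
```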

@staticmethod
def _multiindex_to_dict(idx: pandas.MultiIndex) -> dict: (source)

Convert a MultiIndex to dictionary.

Helper function used by _add_row_labels_two_levels() and _add_column_labels_two_levels().

Parameters
idx (pandas.MultiIndex): Index object (columns or rows of a pandas dataframe).
Returns
dict: The dictionary indexed by the first level of the multi index.
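
The grouping by first level can be sketched with plain tuples, which is also what iterating over a two-level pandas.MultiIndex yields. The function name and the choice of lists of second-level labels as dictionary values are assumptions; the actual value type is not documented here.

```python
def two_level_index_to_dict(index_tuples):
    """Group second-level labels by their first-level label."""
    result = {}

    for first, second in index_tuples:
        # Dicts keep insertion order, so the label order is preserved.
        result.setdefault(first, []).append(second)

    return result


# Usage with tuples as they would come from a two-level MultiIndex
labels = [('A', 'x'), ('A', 'y'), ('B', 'x')]
grouped = two_level_index_to_dict(labels)
```

Here grouped is {'A': ['x', 'y'], 'B': ['x']}.
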
@staticmethod
def _table_autofit(tab: docx.table.Table): (source)

Autofit objects in table.

Credits: https://github.com/python-openxml/python-docx/issues/209#issuecomment-566128709

def _activate_auto_update_fields(self): (source)

All fields will be updated semi-automatically.

This includes the table of contents and the numbering in table and figure captions. In practice, when such a docx file is opened, MS Word will ask whether all fields should be updated.

Credits: https://stackoverflow.com/a/63799828/4865723

def _caption(self, caption: str, target: str): (source)

Add caption paragraph including name and numbering.

Based on: https://github.com/python-openxml/python-docx/issues/359

Parameters
caption (str): The content of the caption.
target (str): The prefix name (e.g. _Table_ or _Figure_).
def _init_docx_document_instance(self): (source)

Initiate and setup the document instance self._doc.

The method is not called by __init__() but save().

def _init_header_and_footer(self): (source)

Set up the header and footer elements of the document.

The behavior is based on the instance attributes _page_numbering, _header_text and _footer_text.

def _init_styles(self): (source)

Create some additional styles.

def _set_margins(self, margins: str | tuple) -> tuple[int, int, int, int]: (source)

See __init__() for details.

def _set_page_orientation(self, orientation): (source)

Set page orientation if different.

def _setup_language(self): (source)

Set the document's language.

The language is set based on _language for all existing styles.

def _setup_spell_and_grammar_checking(self): (source)

Undocumented

def _table_of_something(self, kind: _ReferenceKind, depth: int): (source)

Generic method to create a reference list (e.g. TOC).

It is used for table_of_contents().

Credits

The code is based on the following sources

- https://stackoverflow.com/a/59170642/4865723
- https://github.com/python-openxml/python-docx/issues/36
- https://github.com/xiaominzhaoparadigm/python-docx-add-list-of-tables-figures/blob/master/LOT.py
_additional_styles: list = (source)

List of user-defined styles added via add_paragraph_style().

_autoupdate_fields = (source)

Activate the auto update field feature.

_captions_position_figure = (source)

Position of figure captions on top (0) or at the bottom (1) of a figure. Bottom is the default.

_captions_position_table = (source)

Position of table captions on top (0) or at the bottom (1) of a table. Bottom is the default.

_doc = (source)

Instance of docx.document.Document.

_footer_text = (source)

String for the document footer.

_header_text = (source)

String for the document header.

_language = (source)

The language relevant for spell checking.

_margins = (source)

A 4 item tuple with millimeter values clockwise starting with top.

_page_numbering = (source)

A dict with settings about page numbering.

_spell_and_grammar_checking = (source)

Document global checking of spelling and grammar.