Simplify dviFontInfo layout in backend pdf. #30082

anntzer · 2025-05-19T22:00:30Z

Use a simpler deterministic mapping of tex font names to pdf embedding names.
Resolve the required attributes only when they are needed (in _embedTeXFont). This avoids having to carry them around and keep track of attributes that go by different names (e.g. "encoding" vs. "encodingfile").

Follow-up to #30027.
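
The idea described above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the code from this PR: `DviFont`, `PsFont`, `FONT_MAP`, `pdf_name` and `FontRegistry` are hypothetical stand-ins, the `"F-"` prefix is an assumed naming scheme, and the map entries are made up. It only shows the shape of the change: the registry stores nothing but the dvifont under a name derived from the TeX name, and the map attributes are looked up at embedding time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DviFont:
    """Hypothetical stand-in for a dviread font; only the TeX name is kept."""
    texname: bytes


@dataclass(frozen=True)
class PsFont:
    """Hypothetical stand-in for one pdftex.map entry."""
    psname: str
    filename: Optional[str]
    encoding: Optional[str]


# Toy replacement for the pdftex.map lookup; the entries are made up.
FONT_MAP = {
    b"cmr10": PsFont("CMR10", "cmr10.pfb", None),
    b"nofile": PsFont("NoFile", None, None),
}


def pdf_name(dvifont):
    # Assumption: the PDF resource name is derived from the TeX name alone,
    # so the same font always gets the same name.
    return "F-" + dvifont.texname.decode("ascii")


class FontRegistry:
    def __init__(self):
        self.fonts = {}  # maps pdf names to dvifonts, nothing else

    def use(self, dvifont):
        # Registering a font does not touch the font map at all.
        name = pdf_name(dvifont)
        self.fonts[name] = dvifont
        return name

    def embed(self, name):
        # Map attributes are resolved only here, at embedding time.
        dvifont = self.fonts[name]
        psfont = FONT_MAP[dvifont.texname]
        if psfont.filename is None:
            raise ValueError(f"No usable font file found for {psfont.psname}")
        return psfont.filename, psfont.encoding


registry = FontRegistry()
name = registry.use(DviFont(b"cmr10"))
print(name, registry.embed(name))
```

With this layout a font without a usable file can still be registered; the error surfaces only when it is actually embedded.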


lib/matplotlib/backends/backend_pdf.py

jkseppan

This has a merge conflict with my recent changes; I'll push my suggested merge somewhere.

jkseppan · 2025-06-24T15:32:33Z

The merge is commit b0e10dc in jkseppan:simple-dvifontinfo and I made a PR into your branch, but it looks terrible in the GitHub interface. Command-line git is much better:

commit b0e10dc6f9ca392e79779d95175bd93a80076f3a
Merge: d7499b9b98 0dcd06f812
Author: Jouni K. Seppänen <[email protected]>
Date:   2025-06-24 18:26:38 +0300

    Merge branch 'main' into simple-dvifontinfo

diff --cc lib/matplotlib/backends/backend_pdf.py
index fc37a21bf4,6d6bea585f..9bb76d3688
--- a/lib/matplotlib/backends/backend_pdf.py
+++ b/lib/matplotlib/backends/backend_pdf.py
@@@ -716,19 -716,17 +716,17 @@@ class PdfFile
          root = {'Type': Name('Catalog'),
                  'Pages': self.pagesObject}
          self.writeObject(self.rootObject, root)
  
          self.infoDict = _create_pdf_info_dict('pdf', metadata or {})
  
          self._internal_font_seq = (Name(f'F{i}') for i in itertools.count(1))
          self._fontNames = {}     # maps filenames to internal font names
 -        self._dviFontInfo = {}   # maps dvi font names to embedding information
 +        self._dviFontInfo = {}   # maps pdf names to dvifonts
-         # differently encoded Type-1 fonts may share the same descriptor
-         self._type1Descriptors = {}
          self._character_tracker = _backend_pdf_ps.CharacterTracker()
  
          self.alphaStates = {}   # maps alpha values to graphics state objects
          self._alpha_state_seq = (Name(f'A{i}') for i in itertools.count(1))
          self._soft_mask_states = {}
          self._soft_mask_seq = (Name(f'SM{i}') for i in itertools.count(1))
          self._soft_mask_groups = []
          self._hatch_patterns = {}
@@@ -761,41 -759,19 +759,40 @@@
                       'XObject': self.XObjectObject,
                       'ExtGState': self._extGStateObject,
                       'Pattern': self.hatchObject,
                       'Shading': self.gouraudObject,
                       'ProcSet': procsets}
          self.writeObject(self.resourceObject, resources)
  
      fontNames = _api.deprecated("3.11")(property(lambda self: self._fontNames))
-     type1Descriptors = _api.deprecated("3.11")(
-         property(lambda self: self._type1Descriptors))
 -    dviFontInfo = _api.deprecated("3.11")(property(lambda self: self._dviFontInfo))
+     type1Descriptors = _api.deprecated("3.11")(property(lambda _: {}))
  
 +    @_api.deprecated("3.11")
 +    @property
 +    def dviFontInfo(self):
 +        d = {}
 +        tex_font_map = dviread.PsfontsMap(dviread.find_tex_file('pdftex.map'))
 +        for pdfname, dvifont in self._dviFontInfo.items():
 +            psfont = tex_font_map[dvifont.texname]
 +            if psfont.filename is None:
 +                raise ValueError(
 +                    "No usable font file found for {} (TeX: {}); "
 +                    "the font may lack a Type-1 version"
 +                    .format(psfont.psname, dvifont.texname))
 +            d[dvifont.texname] = types.SimpleNamespace(
 +                dvifont=dvifont,
 +                pdfname=pdfname,
 +                fontfile=psfont.filename,
 +                basefont=psfont.psname,
 +                encodingfile=psfont.encoding,
 +                effects=psfont.effects,
 +            )
 +        return d
 +
      def newPage(self, width, height):
          self.endStream()
  
          self.width, self.height = width, height
          contentObject = self.reserveObject('page contents')
          annotsObject = self.reserveObject('annotations')
          thePage = {'Type': Name('Page'),
                     'Parent': self.pagesObject,
@@@ -986,72 -993,74 +994,78 @@@
          fontdict = {'Type': Name('Font'),
                      'Subtype': Name('Type1'),
                      'BaseFont': Name(fontname),
                      'Encoding': Name('WinAnsiEncoding')}
          fontdictObject = self.reserveObject('font dictionary')
          self.writeObject(fontdictObject, fontdict)
          return fontdictObject
  
 -    def _embedTeXFont(self, fontinfo):
 -        _log.debug('Embedding TeX font %s - fontinfo=%s',
 -                   fontinfo.dvifont.texname, fontinfo.__dict__)
 +    def _embedTeXFont(self, dvifont):
 +        tex_font_map = dviread.PsfontsMap(dviread.find_tex_file('pdftex.map'))
 +        psfont = tex_font_map[dvifont.texname]
 +        if psfont.filename is None:
 +            raise ValueError(
 +                "No usable font file found for {} (TeX: {}); "
 +                "the font may lack a Type-1 version"
 +                .format(psfont.psname, dvifont.texname))
  
-         # Widths
-         widthsObject = self.reserveObject('font widths')
-         tfm = dvifont._tfm
-         # convert from TeX's 12.20 representation to 1/1000 text space units.
-         widths = [(1000 * metrics.tex_width) >> 20
-                   if (metrics := tfm.get_metrics(char)) else 0
-                   for char in range(max(tfm._glyph_metrics, default=-1) + 1)]
-         self.writeObject(widthsObject, widths)
- 
-         # Font dictionary
+         # The font dictionary is the top-level object describing a font
          fontdictObject = self.reserveObject('font dictionary')
          fontdict = {
              'Type':      Name('Font'),
              'Subtype':   Name('Type1'),
-             'FirstChar': 0,
-             'LastChar':  len(widths) - 1,
-             'Widths':    widthsObject,
-             }
+         }
  
-         # Encoding (if needed)
-         if psfont.encoding is not None:
-             fontdict['Encoding'] = {
-                 'Type': Name('Encoding'),
-                 'Differences': [
-                     0, *map(Name, dviread._parse_enc(psfont.encoding))],
-             }
- 
-         # We have a font file to embed - read it in and apply any effects
+         # Read the font file and apply any encoding changes and effects
 -        t1font = _type1font.Type1Font(fontinfo.fontfile)
 -        if fontinfo.encodingfile is not None:
 +        t1font = _type1font.Type1Font(psfont.filename)
++        if psfont.encoding is not None:
+             t1font = t1font.with_encoding(
 -                {i: c for i, c in enumerate(dviread._parse_enc(fontinfo.encodingfile))}
++                {i: c for i, c in enumerate(dviread._parse_enc(psfont.encoding))}
+             )
 -        if fontinfo.effects:
 -            t1font = t1font.transform(fontinfo.effects)
 +        if psfont.effects:
 +            t1font = t1font.transform(psfont.effects)
+ 
+         # Reduce the font to only the glyphs used in the document, get the encoding
+         # for that subset, and compute various properties based on the encoding.
 -        chars = frozenset(self._character_tracker.used[fontinfo.dvifont.fname])
++        chars = frozenset(self._character_tracker.used[dvifont.fname])
+         t1font = t1font.subset(chars, self._get_subset_prefix(chars))
          fontdict['BaseFont'] = Name(t1font.prop['FontName'])
+         # createType1Descriptor writes the font data as a side effect
+         fontdict['FontDescriptor'] = self.createType1Descriptor(t1font)
+         encoding = t1font.prop['Encoding']
+         fontdict['Encoding'] = self._generate_encoding(encoding)
+         fc = fontdict['FirstChar'] = min(encoding.keys(), default=0)
+         lc = fontdict['LastChar'] = max(encoding.keys(), default=255)
  
-         # Font descriptors may be shared between differently encoded
-         # Type-1 fonts, so only create a new descriptor if there is no
-         # existing descriptor for this font.
-         effects = (psfont.effects.get('slant', 0.0),
-                    psfont.effects.get('extend', 1.0))
-         fontdesc = self._type1Descriptors.get((psfont.filename, effects))
-         if fontdesc is None:
-             fontdesc = self._type1Descriptors[psfont.filename, effects] = \
-                 self.createType1Descriptor(t1font)
-         fontdict['FontDescriptor'] = fontdesc
- 
+         # Convert glyph widths from TeX 12.20 fixed point to 1/1000 text space units
 -        tfm = fontinfo.dvifont._tfm
++        tfm = dvifont._tfm
+         widths = [(1000 * metrics.tex_width) >> 20
+                   if (metrics := tfm.get_metrics(char)) else 0
+                   for char in range(fc, lc + 1)]
+         fontdict['Widths'] = widthsObject = self.reserveObject('glyph widths')
+         self.writeObject(widthsObject, widths)
 -
          self.writeObject(fontdictObject, fontdict)
          return fontdictObject
  
+ 
+     def _generate_encoding(self, encoding):
+         prev = -2
+         result = []
+         for code, name in sorted(encoding.items()):
+             if code != prev + 1:
+                 result.append(code)
+             prev = code
+             result.append(Name(name))
+         return {
+             'Type': Name('Encoding'),
+             'Differences': result
+         }
+ 
+ 
      @_api.delete_parameter("3.11", "fontfile")
      def createType1Descriptor(self, t1font, fontfile=None):
          # Create and write the font descriptor and the font file
          # of a Type-1 font
          fontdescObject = self.reserveObject('font descriptor')
          fontfileObject = self.reserveObject('font file')
  
          italic_angle = t1font.prop['ItalicAngle']
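
To make the two computations in the merged `_embedTeXFont` easier to follow, here is a standalone sketch of them. It mirrors the loop of `_generate_encoding` and the width conversion from the diff above, but `Name` is a hypothetical stand-in for the backend's PDF name type, the function names `differences_array` and `pdf_widths` are invented for this example, and the encoding and widths are made-up values rather than data from a real font.

```python
class Name:
    """Hypothetical stand-in for the backend's PDF name type."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "/" + self.name


def differences_array(encoding):
    """Build a PDF /Differences list from a {code: glyph name} mapping."""
    prev = -2  # guarantees that even code 0 starts a new run
    result = []
    for code, name in sorted(encoding.items()):
        if code != prev + 1:
            result.append(code)  # gap: emit the first code of the new run
        prev = code
        result.append(Name(name))
    return result


def pdf_widths(tex_widths, first_char, last_char):
    """Convert TeX fixed-point widths (20 fractional bits) to 1/1000 units."""
    return [(1000 * tex_widths[code]) >> 20 if code in tex_widths else 0
            for code in range(first_char, last_char + 1)]


encoding = {65: "A", 66: "B", 97: "a"}                 # made-up subset
tex_widths = {65: 3 << 18, 66: 1 << 19, 97: 1 << 19}   # 0.75, 0.5, 0.5
first, last = min(encoding), max(encoding)

print(differences_array(encoding))                     # [65, /A, /B, 97, /a]
widths = pdf_widths(tex_widths, first, last)
print(widths[0], widths[1], widths[-1], len(widths))   # 750 500 500 33
```

Consecutive codes share one starting number in the Differences list, and codes between `first` and `last` that have no metrics get a width of 0, which is why the widths list has one entry per code in the range rather than one per glyph.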

jkseppan · 2025-06-24T15:55:58Z

Or, easier to read after a rebase: jkseppan:simple-dvifontinfo-rebase, commit 9a8c078.

anntzer · 2025-06-24T16:18:42Z

Thanks for providing the rebase.

jkseppan · 2025-06-24T17:44:37Z

The appveyor failure looks like a timeout in the final phases, but the actual tests passed. The coverage decrease is because the deprecated dviFontInfo property does not get exercised in tests. I'm not convinced that it needs a test; feel free to self-merge either as is or with an additional test for that.
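
For reference, a test that exercises a deprecated property usually only needs to read the property and check that a warning is emitted. The sketch below uses only the standard library; `PdfFileSketch` and `access_and_collect_warnings` are hypothetical names, and the real property is wrapped with matplotlib's `_api.deprecated`, which is not reproduced here.

```python
import warnings


class PdfFileSketch:
    """Hypothetical stand-in with a property that warns when accessed."""

    def __init__(self):
        self._dviFontInfo = {}

    @property
    def dviFontInfo(self):
        warnings.warn("dviFontInfo is deprecated", DeprecationWarning,
                      stacklevel=2)
        return dict(self._dviFontInfo)


def access_and_collect_warnings(obj):
    """Read the deprecated property and return its value and warning types."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not let filters hide the warning
        value = obj.dviFontInfo
    return value, [w.category for w in caught]


value, categories = access_and_collect_warnings(PdfFileSketch())
print(value, categories)
```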

anntzer · 2025-06-24T22:19:23Z

Looks like everything got randomly fixed after a minor cleanup that removed the extra whitespace.

anntzer added backend: pdf Maintenance labels May 19, 2025

github-actions bot added the status: needs rebase label May 30, 2025

QuLogic reviewed May 30, 2025

anntzer force-pushed the simple-dvifontinfo branch from 36a6703 to d7499b9 Compare May 30, 2025 06:17

github-actions bot removed the status: needs rebase label May 30, 2025

QuLogic added this to the v3.11.0 milestone May 30, 2025

QuLogic approved these changes May 30, 2025

QuLogic added this to Font and text overhaul Jun 5, 2025

github-project-automation bot moved this to Waiting for other PR in Font and text overhaul Jun 5, 2025

QuLogic moved this from Waiting for other PR to Ready for Review in Font and text overhaul Jun 5, 2025

github-actions bot added the status: needs rebase label Jun 6, 2025

jkseppan approved these changes Jun 24, 2025

anntzer force-pushed the simple-dvifontinfo branch from d7499b9 to 272dda0 Compare June 24, 2025 16:18

github-actions bot removed the status: needs rebase label Jun 24, 2025

anntzer force-pushed the simple-dvifontinfo branch from 272dda0 to e3c7a64 Compare June 24, 2025 20:49

QuLogic merged commit fed8c20 into matplotlib:main Jun 25, 2025
40 checks passed

github-project-automation bot moved this from Ready for Review to Done in Font and text overhaul Jun 25, 2025

anntzer deleted the simple-dvifontinfo branch June 25, 2025 05:51
