Simplify dviFontInfo layout in backend pdf. #30082

Merged
QuLogic merged 1 commit into matplotlib:main from the simple-dvifontinfo branch on Jun 25, 2025

Conversation

anntzer
Contributor

@anntzer anntzer commented May 19, 2025

  • Use a simpler, deterministic mapping of TeX font names to PDF embedding names.
  • Resolve the required attributes only when needed (in _embedTeXFont), which avoids having to carry around and worry about attributes whose names differ between objects (e.g. "encoding" vs. "encodingfile").
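
The two points above amount to keying the font table by a name derived directly from the TeX font name, storing only the font object, and looking up everything else at embedding time. A minimal sketch of that pattern follows. `DviFont`, `PsFont`, `FontTable`, `FONT_MAP`, and the `F-` prefix are hypothetical stand-ins chosen for illustration and are not taken from this PR; only the overall idea (map PDF names to dvifonts, resolve the rest in the embedding step) comes from the description and the diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DviFont:
    """Hypothetical stand-in for a DVI font; only the TeX name matters here."""
    texname: bytes


@dataclass(frozen=True)
class PsFont:
    """Hypothetical stand-in for one entry of a TeX font map file."""
    psname: str
    filename: Optional[str]
    encoding: Optional[str]


# Hypothetical font map; in the PR this role is played by pdftex.map.
FONT_MAP = {
    b"cmr10": PsFont("CMR10", "cmr10.pfb", None),
    b"ptmr8r": PsFont("Times-Roman", None, "8r.enc"),
}


class FontTable:
    def __init__(self):
        self._fonts = {}  # maps PDF names to DviFont objects, nothing else

    def pdf_name(self, dvifont):
        # Deterministic: the same TeX name always yields the same PDF name.
        # The "F-" prefix is an assumption made for this sketch.
        name = "F-" + dvifont.texname.decode("ascii")
        self._fonts[name] = dvifont
        return name

    def embed(self, name):
        # File name and encoding are resolved only here, at embedding time.
        dvifont = self._fonts[name]
        psfont = FONT_MAP[dvifont.texname]
        if psfont.filename is None:
            raise ValueError(f"No usable font file found for {psfont.psname}")
        return psfont.filename, psfont.encoding


table = FontTable()
first = table.pdf_name(DviFont(b"cmr10"))
second = table.pdf_name(DviFont(b"cmr10"))
```

Because the table stores nothing but the font object, there is no second record whose attribute names could drift out of sync with the font map entry.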

Follow-up to #30027.

@anntzer anntzer force-pushed the simple-dvifontinfo branch from 36a6703 to d7499b9 Compare May 30, 2025 06:17
@QuLogic QuLogic added this to the v3.11.0 milestone May 30, 2025
@github-project-automation github-project-automation bot moved this to Waiting for other PR in Font and text overhaul Jun 5, 2025
@QuLogic QuLogic moved this from Waiting for other PR to Ready for Review in Font and text overhaul Jun 5, 2025
Member

@jkseppan jkseppan left a comment

This has a merge conflict with my recent changes; I'll push my suggested merge somewhere.

@jkseppan
Member

The merge is commit b0e10dc in jkseppan:simple-dvifontinfo and I made a PR into your branch, but it looks terrible in the GitHub interface. Command-line git is much better:

commit b0e10dc6f9ca392e79779d95175bd93a80076f3a
Merge: d7499b9b98 0dcd06f812
Author: Jouni K. Seppänen
Date:   2025-06-24 18:26:38 +0300

    Merge branch 'main' into simple-dvifontinfo

diff --cc lib/matplotlib/backends/backend_pdf.py
index fc37a21bf4,6d6bea585f..9bb76d3688
--- a/lib/matplotlib/backends/backend_pdf.py
+++ b/lib/matplotlib/backends/backend_pdf.py
@@@ -716,19 -716,17 +716,17 @@@ class PdfFile
          root = {'Type': Name('Catalog'),
                  'Pages': self.pagesObject}
          self.writeObject(self.rootObject, root)
  
          self.infoDict = _create_pdf_info_dict('pdf', metadata or {})
  
          self._internal_font_seq = (Name(f'F{i}') for i in itertools.count(1))
          self._fontNames = {}     # maps filenames to internal font names
 -        self._dviFontInfo = {}   # maps dvi font names to embedding information
 +        self._dviFontInfo = {}   # maps pdf names to dvifonts
-         # differently encoded Type-1 fonts may share the same descriptor
-         self._type1Descriptors = {}
          self._character_tracker = _backend_pdf_ps.CharacterTracker()
  
          self.alphaStates = {}   # maps alpha values to graphics state objects
          self._alpha_state_seq = (Name(f'A{i}') for i in itertools.count(1))
          self._soft_mask_states = {}
          self._soft_mask_seq = (Name(f'SM{i}') for i in itertools.count(1))
          self._soft_mask_groups = []
          self._hatch_patterns = {}
@@@ -761,41 -759,19 +759,40 @@@
                       'XObject': self.XObjectObject,
                       'ExtGState': self._extGStateObject,
                       'Pattern': self.hatchObject,
                       'Shading': self.gouraudObject,
                       'ProcSet': procsets}
          self.writeObject(self.resourceObject, resources)
  
      fontNames = _api.deprecated("3.11")(property(lambda self: self._fontNames))
-     type1Descriptors = _api.deprecated("3.11")(
-         property(lambda self: self._type1Descriptors))
 -    dviFontInfo = _api.deprecated("3.11")(property(lambda self: self._dviFontInfo))
+     type1Descriptors = _api.deprecated("3.11")(property(lambda _: {}))
  
 +    @_api.deprecated("3.11")
 +    @property
 +    def dviFontInfo(self):
 +        d = {}
 +        tex_font_map = dviread.PsfontsMap(dviread.find_tex_file('pdftex.map'))
 +        for pdfname, dvifont in self._dviFontInfo.items():
 +            psfont = tex_font_map[dvifont.texname]
 +            if psfont.filename is None:
 +                raise ValueError(
 +                    "No usable font file found for {} (TeX: {}); "
 +                    "the font may lack a Type-1 version"
 +                    .format(psfont.psname, dvifont.texname))
 +            d[dvifont.texname] = types.SimpleNamespace(
 +                dvifont=dvifont,
 +                pdfname=pdfname,
 +                fontfile=psfont.filename,
 +                basefont=psfont.psname,
 +                encodingfile=psfont.encoding,
 +                effects=psfont.effects,
 +            )
 +        return d
 +
      def newPage(self, width, height):
          self.endStream()
  
          self.width, self.height = width, height
          contentObject = self.reserveObject('page contents')
          annotsObject = self.reserveObject('annotations')
          thePage = {'Type': Name('Page'),
                     'Parent': self.pagesObject,
@@@ -986,72 -993,74 +994,78 @@@
          fontdict = {'Type': Name('Font'),
                      'Subtype': Name('Type1'),
                      'BaseFont': Name(fontname),
                      'Encoding': Name('WinAnsiEncoding')}
          fontdictObject = self.reserveObject('font dictionary')
          self.writeObject(fontdictObject, fontdict)
          return fontdictObject
  
 -    def _embedTeXFont(self, fontinfo):
 -        _log.debug('Embedding TeX font %s - fontinfo=%s',
 -                   fontinfo.dvifont.texname, fontinfo.__dict__)
 +    def _embedTeXFont(self, dvifont):
 +        tex_font_map = dviread.PsfontsMap(dviread.find_tex_file('pdftex.map'))
 +        psfont = tex_font_map[dvifont.texname]
 +        if psfont.filename is None:
 +            raise ValueError(
 +                "No usable font file found for {} (TeX: {}); "
 +                "the font may lack a Type-1 version"
 +                .format(psfont.psname, dvifont.texname))
  
-         # Widths
-         widthsObject = self.reserveObject('font widths')
-         tfm = dvifont._tfm
-         # convert from TeX's 12.20 representation to 1/1000 text space units.
-         widths = [(1000 * metrics.tex_width) >> 20
-                   if (metrics := tfm.get_metrics(char)) else 0
-                   for char in range(max(tfm._glyph_metrics, default=-1) + 1)]
-         self.writeObject(widthsObject, widths)
- 
-         # Font dictionary
+         # The font dictionary is the top-level object describing a font
          fontdictObject = self.reserveObject('font dictionary')
          fontdict = {
              'Type':      Name('Font'),
              'Subtype':   Name('Type1'),
-             'FirstChar': 0,
-             'LastChar':  len(widths) - 1,
-             'Widths':    widthsObject,
-             }
+         }
  
-         # Encoding (if needed)
-         if psfont.encoding is not None:
-             fontdict['Encoding'] = {
-                 'Type': Name('Encoding'),
-                 'Differences': [
-                     0, *map(Name, dviread._parse_enc(psfont.encoding))],
-             }
- 
-         # We have a font file to embed - read it in and apply any effects
+         # Read the font file and apply any encoding changes and effects
 -        t1font = _type1font.Type1Font(fontinfo.fontfile)
 -        if fontinfo.encodingfile is not None:
 +        t1font = _type1font.Type1Font(psfont.filename)
++        if psfont.encoding is not None:
+             t1font = t1font.with_encoding(
 -                {i: c for i, c in enumerate(dviread._parse_enc(fontinfo.encodingfile))}
++                {i: c for i, c in enumerate(dviread._parse_enc(psfont.encoding))}
+             )
 -        if fontinfo.effects:
 -            t1font = t1font.transform(fontinfo.effects)
 +        if psfont.effects:
 +            t1font = t1font.transform(psfont.effects)
+ 
+         # Reduce the font to only the glyphs used in the document, get the encoding
+         # for that subset, and compute various properties based on the encoding.
 -        chars = frozenset(self._character_tracker.used[fontinfo.dvifont.fname])
++        chars = frozenset(self._character_tracker.used[dvifont.fname])
+         t1font = t1font.subset(chars, self._get_subset_prefix(chars))
          fontdict['BaseFont'] = Name(t1font.prop['FontName'])
+         # createType1Descriptor writes the font data as a side effect
+         fontdict['FontDescriptor'] = self.createType1Descriptor(t1font)
+         encoding = t1font.prop['Encoding']
+         fontdict['Encoding'] = self._generate_encoding(encoding)
+         fc = fontdict['FirstChar'] = min(encoding.keys(), default=0)
+         lc = fontdict['LastChar'] = max(encoding.keys(), default=255)
  
-         # Font descriptors may be shared between differently encoded
-         # Type-1 fonts, so only create a new descriptor if there is no
-         # existing descriptor for this font.
-         effects = (psfont.effects.get('slant', 0.0),
-                    psfont.effects.get('extend', 1.0))
-         fontdesc = self._type1Descriptors.get((psfont.filename, effects))
-         if fontdesc is None:
-             fontdesc = self._type1Descriptors[psfont.filename, effects] = \
-                 self.createType1Descriptor(t1font)
-         fontdict['FontDescriptor'] = fontdesc
- 
+         # Convert glyph widths from TeX 12.20 fixed point to 1/1000 text space units
 -        tfm = fontinfo.dvifont._tfm
++        tfm = dvifont._tfm
+         widths = [(1000 * metrics.tex_width) >> 20
+                   if (metrics := tfm.get_metrics(char)) else 0
+                   for char in range(fc, lc + 1)]
+         fontdict['Widths'] = widthsObject = self.reserveObject('glyph widths')
+         self.writeObject(widthsObject, widths)
 -
          self.writeObject(fontdictObject, fontdict)
          return fontdictObject
  
+ 
+     def _generate_encoding(self, encoding):
+         prev = -2
+         result = []
+         for code, name in sorted(encoding.items()):
+             if code != prev + 1:
+                 result.append(code)
+             prev = code
+             result.append(Name(name))
+         return {
+             'Type': Name('Encoding'),
+             'Differences': result
+         }
+ 
+ 
      @_api.delete_parameter("3.11", "fontfile")
      def createType1Descriptor(self, t1font, fontfile=None):
          # Create and write the font descriptor and the font file
          # of a Type-1 font
          fontdescObject = self.reserveObject('font descriptor')
          fontfileObject = self.reserveObject('font file')
  
          italic_angle = t1font.prop['ItalicAngle']
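
Two steps in the merged `_embedTeXFont` are easier to follow in isolation: building the `Differences` array in `_generate_encoding`, and converting TeX widths to PDF units. The sketch below restates both as standalone functions. It is a simplification under stated assumptions: glyph names are plain strings with a leading slash instead of matplotlib `Name` objects, and the function names `differences_array` and `tex_width_to_pdf` are made up for this illustration.

```python
def differences_array(encoding):
    """Build a PDF-style Differences list from a {code: glyph name} mapping.

    A character code is written only at the start of each run of
    consecutive codes; the names that follow it are numbered implicitly.
    """
    result = []
    prev = -2  # sentinel: with -1, code 0 would look like a continuation
    for code, name in sorted(encoding.items()):
        if code != prev + 1:
            result.append(code)
        prev = code
        result.append("/" + name)  # stand-in for a PDF name object
    return result


def tex_width_to_pdf(tex_width):
    """Convert a 12.20 fixed-point TeX width to 1/1000 text space units."""
    return (1000 * tex_width) >> 20


# Codes 65 and 66 form one run, 97 starts a new one.
runs = differences_array({65: "A", 66: "B", 97: "a"})
# → [65, "/A", "/B", 97, "/a"]

# 1 << 20 is 1.0 in 12.20 fixed point, i.e. one full design size.
full = tex_width_to_pdf(1 << 20)   # → 1000
half = tex_width_to_pdf(1 << 19)   # → 500
```

The sentinel value explains the `prev = -2` in the diff: starting from -1 would suppress the leading code when the encoding begins at code 0.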

@jkseppan
Member

Or, easier to read after a rebase: jkseppan:simple-dvifontinfo-rebase, commit 9a8c078.

@anntzer anntzer force-pushed the simple-dvifontinfo branch from d7499b9 to 272dda0 Compare June 24, 2025 16:18
@anntzer
Contributor Author

anntzer commented Jun 24, 2025

Thanks for providing the rebase.

@jkseppan
Member

The appveyor failure looks like a timeout in the final phases, but the actual tests passed. The coverage decrease is because the deprecated dviFontInfo property does not get exercised in tests. I'm not convinced that it needs a test; feel free to self-merge either as is or with an additional test for that.
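
For reference, a test that exercises a deprecated property usually only needs to read it once and check that a warning is raised. The sketch below shows the shape of such a check using the standard library alone. `PdfFileStandIn` and `check_deprecated_property` are hypothetical; this is not matplotlib's `PdfFile`, and a real test in the matplotlib suite would presumably use its own deprecation warning class and pytest helpers instead.

```python
import warnings


class PdfFileStandIn:
    """Hypothetical stand-in with a deprecated read-only property."""

    def __init__(self):
        self._dviFontInfo = {}

    @property
    def dviFontInfo(self):
        warnings.warn("dviFontInfo is deprecated", DeprecationWarning,
                      stacklevel=2)
        return dict(self._dviFontInfo)


def check_deprecated_property():
    """Read the property once and report the value and warning categories."""
    pdf = PdfFileStandIn()
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure the warning is not filtered
        value = pdf.dviFontInfo
    return value, [w.category for w in caught]


value, categories = check_deprecated_property()
```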

- Use a simpler deterministic mapping of tex font names to pdf embedding
  names.
- Only resolve the required attributes when needed (in _embedTeXFont),
  which avoids e.g. having to carry around and worry about attributes
  with different names (e.g. "encoding" vs. "encodingfile").
@anntzer anntzer force-pushed the simple-dvifontinfo branch from 272dda0 to e3c7a64 Compare June 24, 2025 20:49
@anntzer
Contributor Author

anntzer commented Jun 24, 2025

Looks like everything got randomly fixed after a minor cleanup that removed extra whitespace.

@QuLogic QuLogic merged commit fed8c20 into matplotlib:main Jun 25, 2025
40 checks passed
@github-project-automation github-project-automation bot moved this from Ready for Review to Done in Font and text overhaul Jun 25, 2025
@anntzer anntzer deleted the simple-dvifontinfo branch June 25, 2025 05:51
Projects
Status: Done
3 participants