Inspired by Sebastian Ledesma's (@sebas_ledesma) work on new syntax highlighting parsers [discussion:d005538c39][feature-requests:#250][feature-requests:#251], I started reviewing and improving the relevant TCoolEdit code in the CoolPrj code base (see [discussion:d005538c39#b357]).
Although this was initially only exploratory and experimental, I have decided to submit my changes, as requested by Sebastian. My further support may be limited, but hopefully my changes will benefit him and others in using, maintaining and further developing the neat syntax highlighting feature.
Changes I have made so far [r7154]:
Edit: Additional refinements in this overhaul:
Bugs: #285
Bugs: #525
Bugs: #526
Bugs: #551
Bugs: #553
Bugs: #554
Bugs: #555
Bugs: #563
Bugs: #580
Bugs: #581
Bugs: #582
Bugs: #583
Bugs: #588
Commit: [r7151]
Commit: [r7154]
Commit: [r7155]
Commit: [r7156]
Commit: [r7157]
Commit: [r7163]
Commit: [r7165]
Commit: [r7167]
Commit: [r7175]
Commit: [r7176]
Commit: [r7180]
Commit: [r7181]
Commit: [r7187]
Commit: [r7188]
Commit: [r7191]
Commit: [r7197]
Commit: [r7200]
Commit: [r7205]
Commit: [r7207]
Commit: [r7209]
Commit: [r7211]
Commit: [r7214]
Commit: [r7226]
Commit: [r7282]
Commit: [r7289]
Commit: [r7381]
Commit: [r7408]
Commit: [r7427]
Commit: [r7428]
Discussion: CoolEdit issues
Discussion: CoolEdit issues
Discussion: d005538c39
Discussion: CoolEdit in C++Builder 10.x
Discussion: CoolEdit in C++Builder 10.x
Discussion: CoolEdit in C++Builder 10.x
Feature Requests: #250
Feature Requests: #251
Feature Requests: #253
Feature Requests: #254
Feature Requests: #266
Feature Requests: #268
Feature Requests: #269
Wiki: Coding_Standards
Wiki: OWLNext_Roadmap_and_Prereleases
Anonymous
BUG: CoolPrj: TRsrcSyntaxParser does not parse lines of length 1 [r7184].
Also, the TRsrcSyntaxParser does not recognise ACCELERATORS, STRINGTABLE and TEXTINCLUDE as keywords.
Looking at the implementation it looks suspiciously like the old C++ parser implementation, just with a different keyword list. I've now tried to copy over the new cleaned up C++ parser implementation, just with the keyword list changed (including adding the omissions mentioned above), and it seems to work well.
I have now committed this new RC parser [r7185], but note that we ideally should get rid of this duplication of code by refactoring the code for reuse between the parsers, e.g. by using a common base class that has the keyword list as a compile-time parameter. I have left that as a
\todoitem in the file documentation.Edit: This has now been implemented. A base class TCppParserBase for TCppParser and TRcParser has been added, for code reuse [r7214].
Related
Commit: [r7184]
Commit: [r7185]
Commit: [r7214]
Last edit: Vidar Hasfjord 2024-11-16
BUG:TCoolEdit renders multiline syntax elements incorrectly [bugs:#583].
In the new code, an initialisation order bug affecting this issue was fixed in [r7186], but this bug isn't the only cause of the issue, since the issue applies to the old code as well, and fixing the bug in the new code doesn't solve the issue.
The wider problem seems to be with the way cookies are handled in TCoolTextWnd. Notably, in DrawLine, a line is not parsed if it is empty. Changing the code so that empty lines are parsed seems to fix the issue, but the question remains: why?
GetParserCookie should give us the correct cookie for the preceding line, even if the preceding line is empty, since GetParserCookie will call ParseLine for any line that needs its cookie generated. It doesn't skip empty lines, and it will even iterate over the whole file, if necessary.
I suspect the issue is that the cookies are not properly updated as lines are edited. Editing may invalidate the cookie container, either if the change affects the rendering of the following line (as is the case for extended comments), or if you add or remove lines (in which case the line indexes change, thereby invalidating the cookie container from that line onwards) . I'm not sure this is properly handled, if at all. I tried saving and reopening the test file, thereby regenerating the cookies, and after that the extended comment did indeed render properly, indicating that the issue lies with stale cookies, as suspected.
Edit: This issue has now been fixed. See [bugs:#583] and [r7189][r7192].
Related
Bugs: #583
Commit: [r7186]
Commit: [r7189]
Commit: [r7192]
Last edit: Vidar Hasfjord 2024-11-16
With the addition of TCppParserBase (in "lang.h") introduced in [r7214], it is now easier than ever to create a new parser based on a C-like language, provided that it (more or less) shares the same tokens as C++, including number, string and character literals, escape sequences, line continuation (line ending with backslash), "#..." preprocessor directives, "//..." line comments, and "/*...*/" extended comments. You can just derive from TCppParserBase and override virtual function IsKeyword to indicate the words that should be highlighted as keywords.
TCppParser and TRcParser now do just that.
However, in [r7214], I did more changes as well. A major change was to effectively retire TParserSelector from the API. It is still alive, but it is now just a bloated implementation detail in "lang.cpp", hence given a trailing underscore (TParserSelector_) to indicate that it is not part of the library API. However, it is far too elaborate in its current role; the implementation can do without it and be much simplified. On the other hand, I've left it intact, in case there is demand for bringing it back to the API.
Just to clarify, its purpose is to simplify parser selection for clients that provide their own, in addition to, or in place of, the built-in parsers. By passing true (the default) for initDefTypes in the constructor, the class will initialise its list of parsers from the built-in list. If false is passed, it will start empty. Then the apparent idea is that the user should call AddParser to add custom parsers, by passing a parser factory function for the parser, along with the appropriate file filter pattern (which is a string, listing filters separated by semicolon). When this is all done, the client can then call CreateParser, passing a file path, and have the appropriate parser for that file type returned, if any.
All that said, I suspect this is overkill for the few (quite possibly zero) applications that implement a custom parser. I bet they are likely to just test the file type themselves and simply create their custom parser directly using its contructor, rather than go through all these hoops to use the TParserSelector functionality. And to use the built-in parsers, they'll just simply call ChooseSyntaxParser (declared in "lang.h").
Okay, that was my long awkward exposition. Now, do you want TParserSelector back or not?
If not, I may further simplify "lang.cpp" by sending TParserSelector_ into the woods.
Edit: TParserSelector_ is no more [r7226].
Related
Commit: [r7214]
Commit: [r7226]
Last edit: Vidar Hasfjord 2024-10-27
The original C++ parser did a lousy job with numbers. This has been much improved in [r7226]. Unlike the old, the revised implementation properly supports scientific format, digit separators and suffixes (built-in and user-defined).
Here is a test screenshot; new on the left, old on the right:
Please review and test!
Related
Commit: [r7226]
Last edit: Vidar Hasfjord 2025-09-11
In [r7289], I overhauled the TEditPropDlg class for the syntax highlighting configuration dialog. In the following screenshot the old dialog is on the left, and the revamped dialog is on the right.
A colour sample along with the colour name is now shown for the items in the colour drop-down boxes.
In addition to the drop-down lists, I have added detail buttons that open the common colour dialog, so that any custom colour can be selected. After selecting a custom colour, it will appear in the drop-down list with the name "Custom".
The default colour is shown in the drop-down lists by appending "(Default)" to the colour name. The name of the currently active colour is shown in bold font. This is the "current" colour that was active when you opened the dialog, or since you switched category. That is, if you change the selected colour from the current colour and then switch to a different category, then if you return to the same category, your selected colour is now the current colour shown in bold.
The dialog has generally been enlarged to better accommodate long font and colour names. And the dialog now uses the operating system message font (which is slightly larger than the old Tahoma dialog font).
Category items "Normal Text" and "Whitespace" have swapped places, so that the former appears first in the list, for clarity and consistency. Note also that the font name and size fields are now disabled for other categories than normal text, since these base font settings are shared. Font variants bold, italic and underline can be individually set for each category, though.
Note:
There is still a lingering bug in the colour drop-down boxes, where sometimes multiple custom colours are listed at the bottom of the list (max two should be shown; the selected custom colour, as well as the default colour, if that too is a custom colour not in the list of named colours). If you spot the issue, let me know.Other than that, the dialog is fully functional now — after the bug fix in OWLMaker [bugs:#588] font changes also work, and are restored properly from the configuration file.Edit: The bug has been found and fixed [r7311].
Related
Bugs:
#588Commit: [r7289]
Commit: [r7311]
Last edit: Vidar Hasfjord 2025-08-29
In [r7381], support for coding fonts with ligatures was added. For example, when using the font Cascadia Code, the double-character operator token "!=" will be rendered as a double-width not-equal operator "≠", the double-character arrow token "->" will be rendered as a double-width right-arrow "→", and a series of dashes "----" will be rendered as a solid line "⎯⎯". Other double-character tokens, like comparison, increment and decrement operators, are also rendered as double-width single characters.
In the screenshot below, Cascadia Code is used in the first window from the top, and Cascadia Mono (without ligatures) is used in the second window.
Related
Commit: [r7381]
Last edit: Vidar Hasfjord 2025-08-29
After all the fancy refactoring I've done for CoolEdit — rewrites using nice modern C++, while simplifying the code, correcting bugs and eliminating old inefficiencies and workarounds — you would hope to see some performance gains, right?
Me too. I was very interested to see how my changes have affected the performance. So I added a little benchmark to our CoolDemo example. Alas, any hopes of performance gains were dashed. The new code is several times slower than the old!
In the screenshot above, the results for the old code are on the left, and the results for the new code are on the right. The results on the top are for the first run (cold cache), while the results on the bottom are the best results after a few reruns.
Notable is the much slower reading speed from file. The new code is 16-20 times slower. The main reason for this is that I replaced the old reading code by (notoriously slow) standard streams to enable support for Unicode. We now read and write all the UTF encoding formats, and with any line ending convention. However, this is where most gains can be had by optimising the code.
Sadly, the use of std::map and std::unordered_map for editor data structures, rather than the old raw arrays, has also affected performance negatively. Looking at the parsing and rendering speed (paging up to the top through the whole file), the new code is about 2.5 times slower. For text insertion (paste), the new code is 5 times slower. And for jumping around, the new code is 6-7 times slower.(Edit: It turns out the code performs well. The slowdown is all to do with replacing ExtTextOut by DrawTextEx. See my later post.)
However, note that the new code includes corrections, such as refreshing the text correctly, which the old code didn't do properly. This means the new code may do significantly more work for correctness. Nevertheless, there is opportunity for a lot of optimisation here.
Last edit: Vidar Hasfjord 2025-08-29
In my last post, I discussed performance, and I was concerned I would have to do some optimisation work to regain speed on the trunk (version 8), as compared to the old code (version 7). However, it turns out the major performance degradation in paging speed was all down to the replacement of ExtTextOut by DrawTextEx to get support for font ligatures (see earlier post).
Notably, if you do the same replacement on "branches/7-coolprj-dev" (development branch based on the old code with some fixes and enhancements, including the slow Unicode file operations merged from the trunk), the trunk version is actually faster.
See results below.
CoolEdit Benchmark Results
File: https://sourceforge.net/p/owlnext/code/8323/tree/trunk/source/coolprj/cooledit.cpp (5141 lines)
CoolDemo 7.0.19.8313-prerelease (branches/7-coolprj-dev)
Note: TCoolEdit was modified to have no extra line spacing (LineSpacingPercent = 0), just like the trunk version, to make tests comparable.
The whole benchmark took 1045 ms.
ExtTextOut replaced by DrawTextEx
Note: This levels the playing field, since the trunk version also uses DrawTextEx.
The whole benchmark took 2266 ms.
CoolDemo 8.0.5.8323-prerelease (trunk)
The whole benchmark took 2121 ms.
This makes the trunk version ~7% faster in this test.
Last edit: Vidar Hasfjord 2025-08-15
Note that the particular benchmark in my last post is kind to the old code, since the test file is a C++ file, and "branches/7-coolprj-dev" has merged the new and improved C++ parser from the trunk. The new parser creates far fewer tokens than the old parser.
When running benchmarks using old vs new parsers, the performance difference is much larger than shown previously. For example, see benchmark results below using the old resource file parser, which is identical to the old C++ parser except for different keywords.
CoolEdit Benchmark Results
File: https://sourceforge.net/p/lima-vva/code/2303/tree/trunk/Lima.rc (1860 lines)
CoolDemo 7.0.19.8313-prerelease (branches/7-coolprj-dev)
Note: ExtTextOut was replaced by DrawTextEx, to match the trunk version and level the playing field.
The whole benchmark took 1669 ms.
CoolDemo 8.0.5.8323-prerelease (trunk)
Note: The font size was increased to match the extra line spacing used by the version on "branches/7-coolprj-dev". This means that this test is slightly handicapped by having to draw more pixels.
The whole benchmark took 988 ms.
This makes the trunk version ~59% faster in this test.
Last edit: Vidar Hasfjord 2025-08-15
Great! This could make a noticeable difference even on fast systems.
@elotter wrote:
Thanks. It was a great relief to find that my code didn't cause unnecessary slowdown.
Now, it remains to find a way to optimise DrawTextEx, or use it only when needed (for fonts that have coding ligatures). DrawTextEx has a lot of options, so maybe just finding the right flag to set is all it takes to get a nice speed bump.
(Edit: Optimisations have now been made [r8334].)
PS. Optimising the file operations should be done as well (replacing slow streams).
Related
Commit: [r8334]
Last edit: Vidar Hasfjord 2025-08-17
Diff:
Related
Bugs: #285
Bugs: #525
Bugs: #526
Bugs: #551
Bugs: #553
Commit: [r7165]
Commit: [r7191]