Code::Blocks / Tickets / #224 Duplication in documentation tooltip

fenugrec - 2015-09-24

Tested on SVN 10499.
Sample project :

duplitest.7z

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

ollydbg - 2015-09-25

labels: code complet --> CodeCompletion
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

ollydbg - 2015-09-25

Thanks for the report and the nice description, I can reproduce this bug.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

ollydbg - 2015-09-25

I know the reason why this happens, but I don't have a good idea to solve this.

First thing we need to know, is that a function Token is shared by both the h file and the c/cpp file.
This means, in your example, if you modify the c file, and press save button, the function Token "main()" is removed from the TokenTree, but the function Token "asdf()" is not removed, because the h file is not changed, and "asdf()" is shared by the h and c files.

Let's see document is added to a Token:

void TokenTree::AppendDocumentation(int tokenIdx, const wxString& doc) { wxString& newDoc = m_TokenDocumentationMap[tokenIdx]; if (newDoc == doc) // Do not duplicate return; newDoc += doc; // document could happens before and after the Token, we combine them newDoc.Shrink(); }

You see, the document is append to the old document, not overwrite the old one. This is for some cases that you may have both doxygen document in h and c/cpp files for a single Token. Here, since Token "asdf()" is not removed when you save the file, when parsing the c file, the document is append to the old one.

The bug doesn't happens for the Token "main()", because this Token(and its document) is totally removed when you save the file, and later added again when parsing the c file.

Do you know how to fix this issue? Even in some case like below, we may need to combine the document:

/** document part 1 */ int a; ///< document part 2

Last edit: ollydbg 2015-09-25
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

fenugrec - 2015-09-25

If I know how to fix the issue ? No, definitely not p-) . I know nothing of wxwidgets, c++ or c::b internals , sorry !
I can think of no clean solution except maintaining a list of "sources" for documentation. In other words, every token needs a reference to all the files (.c, .h, others ?) that cause an ::AppenDocumentation() call. Therefore, on save, all the file's tokens are parsed and either removed (like main() ) or "refreshed" like asdf()... if that makes any sense.

Either that, or make the token's "documentation" a linked-list ? (one element per documentation source), and (re)creating the newDoc string completely when required (reparse / save / other conditions?)

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- ollydbg - 2015-09-25
  
  In other words, every token needs a reference to all the files (.c, .h, others ?) that cause an ::AppenDocumentation() call. Therefore, on save, all the file's tokens are parsed and either removed (like main() ) or "refreshed" like asdf()... if that makes any sense.
  
  The idea is that if the c file is reparsed, we need to remove the document collected from the c file, and keep the document from the h file. This is quite complex, since we now only use a single wxString to store all the document for a single Token
  
  Either that, or make the token's "documentation" a linked-list ? (one element per documentation source), and (re)creating the newDoc string completely when required (reparse / save / other conditions?)
  
  Yeah, this will change a single "wxString" to a "map: source file to wxString", when a file is removed, we should clear only one element of the map container.
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

ollydbg - 2015-09-26

I have an idea to solve this issue.

First, we don't need the map:

/** Map: token index -> documentation */ TokenIdxStringMap m_TokenDocumentationMap;

Then, we need to add two field in the Token class:

wxString m_DocForDeclaration; wxString m_DocForDefinition;

And we store the document separately if both a function Token has doxygen style document in both header files and cpp files.

If we reparse the cpp file, we need to clear the m_DocForDefinition, if we reparse the header file, we clear the m_DocForDeclaration.

wxString TokenTree::GetDocumentation(int tokenIdx)

This function return the m_DocForDeclaration+m_DocForDefinition.

BTW: I remember that when the doxygen parser is introduced, I recommanded not add new field to the Token, because the Token is too big, so the author Profile of p2rkw try to add a map(m_TokenDocumentationMap) which map from the token index to the document string.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

ollydbg - 2015-09-26

status: open --> fixed

assigned_to: ollydbg

Milestone: Undefined --> Next Nightly
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

ollydbg - 2015-09-26

OK, this bug is fixed in trunk now(rev 10503). I can either wait for the next nightly build, or build from trunk yourself.

I will close this bug report, and it the issue still happens from your side, please report it here. Thanks.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

White-Tiger - 2015-09-28

wow this is awesome. I also thought about trying to fix this... but it really seemed to be not that easy (as you've also mentioned)

Hope this also fixes a similiar case were you don't have to modify any file to get duplicated doygens (guess caused by including the header with doxygen comments from within multiple files)

sadly I don't have much time to recompile Code::Blocks and try things, so might take a while before I test it.

edit: interestingly, you'll see from my screenshot that the "Returns" part isn't even complete.. weird. Not sure what that is about. Problably the lower than because of HTML? damn^^

Last edit: White-Tiger 2015-09-28

shot00735.png

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- ollydbg - 2015-10-04
  
  Hope this also fixes a similiar case were you don't have to modify any file to get duplicated doygens (guess caused by including the header with doxygen comments from within multiple files)
  
  Hi, White-Tiger, the above is not correct, because if a file(e.g. a header file) is parsed, then CC won't parsed again(unless it is modified), even though the header file are included in many cpp files, it only parsed from the first translation file which include it. This design is for performance.
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

White-Tiger - 2015-10-01

Just confirming that the patch seems to work so far (though I've been able to get duplicates while editing a file without saving it. But saving fixes it, so it is ok for me)

But my issue from my previous comment / picture still exists and seem to be easily reproducable. (sry for hijacking this issue :P Yet it is duplication and thus related)

#ifndef CB_DOXYTEST_H_ #define CB_DOXYTEST_H_ typedef struct MyStructTAG { int random_struct_member; /**< with random description */ } MyStruct; /** * \brief does something with \p ms * \param[in,out] ms * \return boolean * \remark lalalalalala */ int DoSomethingWithMyStruct(MyStruct ms); #endif // CB_DOXYTEST_H_

(I suspect the issue being the "typedef struct ASDTAG {} ASD;" because if I do a forward declaration like "typedef struct ASD ASD;" and then implement the struct "struct ASD {};" it doesn't happen)

P.S. yes, I've modified my Doxygen parser to support more Doxygen rules. So yours will look differently

Last edit: White-Tiger 2015-10-01

shot00737.png
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

ollydbg - 2015-10-04

Sorry, I see too many issues you reported here, I read your last two posts several times, but I can't figure what exact they are, can you create specify ticket for specify bugs? Thanks.

1, "<" or ">" inside the doxygen cause the rendered html wrong
2, what do you mean by "typedef struct ASDTAG {} ASD;"?, can you tell me the excat steps how to reproduce this bug? In my test, it looks like the document always get dupolicated.(In the latest trunk) I have already turn off the setting "parsing while editing" in CC options.

Thanks.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

White-Tiger - 2015-10-04

forget about 1), that was just a sidenote since I had overlooked the fact that it is actually expected behavior. We can use HTML there (and sometimes have to) so "<" and ">" can't be used raw... One could only "fix" that by fully parsing HTML and ignoring weird input... eg. if "<" doesn't follow a known HTML tag or we'll get multiple attributes without values etc. But that is going a bit too far.

2) If you are able to reproduce the duplication using my code above, I don't know what you want from me... If you were not able to reproduce, it might be related to my local doxygen changes (though they've mostly added new keywords and partially messed with the keyword parser)
And yes, I'm talking about the duplication to always occur. Further tesst seem to show that all that is needed is a "typedef struct XXX {} XXX;", doesn't even matter if the tag/struct name is the same as the type name or not. Not sure what's going on there... And it doesn't have to be a function that follows, anything goes.

This simplified example also works (or doesn't work xD?) for me:

:::c++ typedef struct A { int i; } A; /** I'm duplicating ;) */ int bugged_doxygen_here_; /**< I'll work fine... */
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

ollydbg - 2015-10-04

I reopen this bug

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

ollydbg - 2015-10-04

status: fixed --> open
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

ollydbg - 2015-10-04

orget about 1), that was just a sidenote since I had overlooked the fact that it is actually expected behavior. We can use HTML there (and sometimes have to) so "<" and ">" can't be used raw... One could only "fix" that by fully parsing HTML and ignoring weird input... eg. if "<" doesn't follow a known HTML tag or we'll get multiple attributes without values etc. But that is going a bit too far.

OK, I think this should be another bug report, thanks.

2) If you are able to reproduce the duplication using my code above, I don't know what you want from me... If you were not able to reproduce, it might be related to my local doxygen changes (though they've mostly added new keywords and partially messed with the keyword parser)
And yes, I'm talking about the duplication to always occur. Further tesst seem to show that all that is needed is a "typedef struct XXX {} XXX;", doesn't even matter if the tag/struct name is the same as the type name or not. Not sure what's going on there... And it doesn't have to be a function that follows, anything goes.

OK, thanks for the report, and the test code, I will look into it, not sure how soon I can fix it. ^_^

Last edit: ollydbg 2015-10-04

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

ollydbg - 2015-10-04

It looks like the comment

/** I'm duplicating ;) */

were read twice, and the code:

m_NextTokenDoc = doc + m_NextTokenDoc;

are run twice.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

OK, I found it and have a simple patch to fix this issue:

 src/plugins/codecompletion/parser/tokenizer.cpp | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/plugins/codecompletion/parser/tokenizer.cpp b/src/plugins/codecompletion/parser/tokenizer.cpp
index d5dd30d..47019cc 100644
--- a/src/plugins/codecompletion/parser/tokenizer.cpp
+++ b/src/plugins/codecompletion/parser/tokenizer.cpp
@@ -913,6 +913,14 @@ void Tokenizer::UngetToken()
     m_NestLevel      = m_UndoNestLevel;
     m_PeekToken      = m_Token;
     m_PeekAvailable  = true;
+

+    // once the Token Index is moved back, we also need to clear the m_NextTokenDoc
+    // E.g.
+    //        token1  /** comment */ token2
+    //              ^new                   ^old
+    // if the Token Index was reset from position "old" to "new", if we do not reset the
+    // m_NextTokenDoc, we get duplicated "comment".
+    m_NextTokenDoc.Clear();
 }

 /* this function always start from the index of m_TokenIndex

Hope you understand the logic, and I hope it doesn't cause other issue(such as the document get lost dur to reset of m_NextTokenDoc.

Last edit: ollydbg 2015-10-04

Teodor Petrov - 2015-10-04

ollydbg: Do you have a way to automatically test the doxygen comments?
If not I suggest adding this to the test system. It will make your life a lot easier. :)

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- ollydbg - 2015-10-19
  
  I have add one: cctest: add automatically test for doxygen document reading
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Duplication in documentation tooltip

A free C, C++ and Fortran IDE

Milestone

Searches

Help

#224 Duplication in documentation tooltip

Discussion