To get straight to the point: in C/C++, given Pango indexes obtained from a function like pango_layout_xy_to_index, how am I supposed to find the corresponding positions in the string I fed into the layout? This needs to be safe for UTF-8, for ampersand-style special characters (see below), and for formatting tags.
The goal is to implement a copy/cut/replace system for custom UI elements that use Pango-Cairo for text rendering. So, essentially, I’m trying to find a reliable way to get the text between two click points, each of which has been run through pango_layout_xy_to_index.
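A minimal sketch of the "text between two click points" step, written without Pango so it stands alone. It assumes the two indexes are byte offsets into the plain text the layout holds (the Pango documentation describes pango_layout_xy_to_index as returning a byte index, and pango_layout_get_text as returning the layout's text). The helper name textBetween is hypothetical, and the snapping to character boundaries is only a defensive measure in case an index lands inside a multi-byte character.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// True if the byte is a UTF-8 continuation byte (bit pattern 10xxxxxx).
static bool isContinuation(unsigned char c)
{
    return (c & 0xC0) == 0x80;
}

// Hypothetical helper (not part of Pango): returns the bytes of `text`
// between two byte indexes, which may be given in either order.
std::string textBetween(const std::string& text, std::size_t a, std::size_t b)
{
    // Order the two indexes and clamp them to the string length.
    std::size_t first = std::min(std::min(a, b), text.size());
    std::size_t last = std::min(std::max(a, b), text.size());

    // Move the start back to the first byte of its character.
    while (first > 0 && first < text.size() &&
           isContinuation(static_cast<unsigned char>(text[first])))
        --first;

    // Move the end forward so a partially covered character is kept whole.
    while (last < text.size() &&
           isContinuation(static_cast<unsigned char>(text[last])))
        ++last;

    return text.substr(first, last - first);
}
```

With a real layout, the first argument would presumably be pango_layout_get_text( mPangoLayout ) rather than the markup string. Note that the trailing value reported by pango_layout_xy_to_index is a count of characters, not bytes, so it would need converting before being added to a byte index.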
I ask primarily because I’m trying to re-implement some spaghetti code I wrote almost a decade ago that is supposed to do this. It contains a special case claiming that ampersand-style characters (stuff like nbsp) count as 1 character in Pango indexes but obviously occupy several in the string given to Pango, so a conversion is needed. I have zero confidence in any of this, but I wouldn’t have added that without a reason, and it had been working (I think). What I don’t know is why Pango would do that, since the documentation seems to imply that the index is also an index into the original UTF-8 string given to Pango. Perhaps it has something to do with GMarkup? It could even be an ancient bug that has since been fixed.
What I’m hoping for is an authoritative answer on how I’m supposed to do this, so I can avoid any more experimentation/guessing/kludges/bugs.
That said, I am aware Pango handles some extreme cases (like text running in different directions within a line). If that makes a difference to the answer, let’s assume I only care about text running in a single, uniform direction. I have enough problems with that.
I’m also aware there are going to be some gotchas with formatting tags. Right now, I’m only concerned about their presence not breaking my ability to locate character positions after they appear.
For the record, this is how I feed text into the layout:
pango_layout_set_markup( mPangoLayout, mText.c_str( ), -1 );
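For illustration, here is one way the conversion described above could be sketched, again without Pango so it stands alone. It rests on an assumption, not on anything confirmed in this question: that indexes refer to the parsed text (tags removed, entity references decoded), so each tag contributes zero bytes and each reference contributes only the bytes of the character it decodes to. All function names are hypothetical. Only the five predefined references (amp, lt, gt, quot, apos) and numeric references are handled; comments, CDATA, and a '>' inside an attribute value are not.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Number of UTF-8 bytes needed to encode one code point.
static std::size_t utf8Length(unsigned long cp)
{
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

// Bytes of parsed text produced by the reference "&name;".
static std::size_t decodedLength(const std::string& name)
{
    if (name == "amp" || name == "lt" || name == "gt" ||
        name == "quot" || name == "apos")
        return 1;
    if (name.size() > 1 && name[0] == '#') {
        const bool hex = (name[1] == 'x');
        const char* digits = name.c_str() + (hex ? 2 : 1);
        return utf8Length(std::strtoul(digits, nullptr, hex ? 16 : 10));
    }
    return 0; // Assumption: unknown names produce no text in this sketch.
}

// Hypothetical helper: maps a byte offset in the parsed text to the byte
// offset in `markup` where that character's source begins.
std::size_t markupOffsetForTextOffset(const std::string& markup,
                                      std::size_t textOffset)
{
    std::size_t plain = 0; // bytes of parsed text produced so far
    std::size_t i = 0;     // current byte position in the markup

    while (i < markup.size()) {
        if (markup[i] == '<') {
            // A tag adds nothing to the parsed text, so skip it whole.
            const std::size_t close = markup.find('>', i);
            if (close == std::string::npos) break; // malformed markup
            i = close + 1;
            continue;
        }
        if (plain >= textOffset) return i; // reached the requested character
        if (markup[i] == '&') {
            const std::size_t semi = markup.find(';', i);
            if (semi == std::string::npos) break; // malformed markup
            plain += decodedLength(markup.substr(i + 1, semi - i - 1));
            i = semi + 1;
            continue;
        }
        ++plain; // ordinary byte: one byte in both strings
        ++i;
    }
    return markup.size();
}
```

Because tags are skipped before the offset check, an offset that falls directly after a closing tag maps to the position after that tag. That suits the start of a selection; the end of a selection might prefer the position before the tag, which this sketch does not attempt.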