bul
February 3, 2025, 1:26pm
1
Hello everyone,
sorry for my English;
I am writing with the help of https://www.reverso.net/traduction-texte
apply_tag is thrown off by accented characters.
Example:
using Gtk;
/*
valac --pkg gtk4 --pkg gtksourceview-5 g.vala && ./g
apply_tag on a GtkSource.Buffer: problem with accents (or other characters)
*/
int main(string[] argv) {
    Gtk.Application app=new Gtk.Application(null,GLib.ApplicationFlags.HANDLES_OPEN);
    app.activate.connect(() => {
        Gtk.ApplicationWindow window=new Gtk.ApplicationWindow(app);
        window.set_default_size(640,320);
        GtkSource.View src=new GtkSource.View();
        Gtk.TextTagTable tag=new Gtk.TextTagTable();
        Gtk.TextTag tagl=new Gtk.TextTag("link");
        tagl.set_property("underline",Pango.Underline.SINGLE);
        tagl.foreground="#F6624A";
        tag.add(tagl);
        GtkSource.Buffer buf=new GtkSource.Buffer(tag);
        buf.text= "\n aa https://www.aa"+
                  "\n bb http://www.bb"+
                  "\n éé https://www.cc"+
                  "\n dd https://www.dd"+
                  "\n ee http://www.ee";
        src.set_buffer(buf);
        window.set_child(src);
        window.present();
        Gtk.TextIter deb,fin;
        int ts,te;
        buf.get_start_iter(out deb);
        buf.get_end_iter(out fin);
        MatchInfo trv;
        try {
            Regex url=new Regex("https?://[a-zA-Z0-9=?./-]+");
            if ( url.match(buf.get_text(deb,fin,false),0,out trv) ) {
                do {
                    trv.fetch_pos(0,out ts,out te);
                    Gtk.TextIter start;
                    buf.get_iter_at_offset(out start,ts);
                    Gtk.TextIter end;
                    buf.get_iter_at_offset(out end,te);
                    print(buf.get_text(deb,fin,false).substring(ts,te-ts)+"\n");
                    buf.apply_tag(tagl,start,end);
                } while ( trv.next() );
            }
        } catch ( Error e ) {
        }
    });
    return app.run(argv);
}
It works, but after the accented characters the "link" tag is shifted by the number of accented characters.
Where is my mistake?
Thank you in advance.
I forgot: Linux Manjaro, GTK 4, GtkSourceView 5.
Hi,
fetch_pos() will give you a position as a pointer (i.e. a byte index), while get_iter_at_offset() takes a character position (i.e. independent of the underlying size of each character).
Accented characters like “é” are encoded in multiple bytes (2 bytes in UTF-8), so the following character will have a +1 character offset but a +2 byte position.
You will need to convert the byte indexes te and ts into offsets using functions like GLib.utf8_pointer_to_offset.
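To make the byte/character difference concrete, here is a minimal sketch in plain Vala (no GTK needed). It assumes the source file is saved as UTF-8, and the variable names are mine, not from the original program. It uses string.char_count, which Vala binds to g_utf8_strlen and which counts the characters found in the first n bytes of a string.

```vala
// Build and run: valac offsets.vala && ./offsets   (the file name is arbitrary)
void main () {
    string line = "éé https://www.cc";

    // string.length is the size in bytes; char_count () is the number of characters.
    print ("bytes: %d, characters: %d\n", line.length, line.char_count ());
    // prints "bytes: 19, characters: 17" because each "é" takes 2 bytes

    // index_of () returns a byte index, just like MatchInfo.fetch_pos ().
    int byte_index = line.index_of ("https");

    // char_count (n) counts the characters contained in the first n bytes,
    // which is the character offset expected by get_iter_at_offset ().
    int char_offset = line.char_count (byte_index);
    print ("byte index: %d, character offset: %d\n", byte_index, char_offset);
    // prints "byte index: 5, character offset: 3"
}
```

The gap between the two numbers grows with every multi-byte character in front of the match, which matches the drift seen with the "link" tag.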
bul
February 4, 2025, 12:46pm
3
Thank you very much.
I did suspect UTF-8!
It remains to be seen how to use GLib.utf8_pointer_to_offset with Vala, because the documentation is more than succinct!
See you
bul
February 5, 2025, 8:58am
4
Finally, with difficulty, a sequence that works,
even if I’m sure it’s not the best solution:
try {
    Regex url=new Regex("https?://[a-zA-Z0-9=?./-]+");
    string txt=buf.get_text(deb,fin,false);
    if ( url.match(txt,0,out trv) ) {
        do {
            trv.fetch_pos(0,out ts,out te);
            int ts1=txt.index_of_nth_char(ts);
            ts+=(ts-ts1);
            int te1=txt.index_of_nth_char(te);
            te+=(te-te1);
            buf.get_iter_at_offset(out start,ts);
            buf.get_iter_at_offset(out end,te);
            buf.apply_tag(tagl,start,end);
        } while ( trv.next() );
    }
} catch ( Error e ) {
}
Thank you again.
bul
February 5, 2025, 1:16pm
5
Oops…
Characters like ╭ ┬ ┼ ├ break it!
(There must be others.)
Vala and GTK 4 are not simple.
bul:
txt.index_of_nth_char
No, the other way around: you already have an index, and now you need an offset.
Have you tried string.pointer_to_offset – glib-2.0 ?
Note that I’m not sure these pointer games are really possible from Vala… Language bindings usually don’t like that. In the worst case, you may have to implement a small C file with a helper function, and make valac compile and link it with your Vala code.
bul
February 6, 2025, 7:25am
7
“string.pointer_to_offset is deprecated. Use string.char_count”
I followed that advice.
I will dig deeper.
bul
February 11, 2025, 10:04am
8
I had time to look at this again,
and using char_count() it works.
See you
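The final code was not posted in the thread, so the following is only a sketch of what the char_count() approach might look like. The GTK part is left out so that it builds with plain valac; the sample text and the names byte_start, byte_end, char_start and char_end are assumptions for illustration. In the original program, char_start and char_end would be the values passed to buf.get_iter_at_offset() before calling buf.apply_tag().

```vala
// Build and run: valac urls.vala && ./urls   (the file name is arbitrary)
void main () {
    // Sample text with 2-byte (é) and 3-byte (├ ┼) characters before the URLs.
    string txt = "\n aa https://www.aa" +
                 "\n éé https://www.cc" +
                 "\n ├┼ http://www.ee";
    try {
        var url = new Regex ("https?://[a-zA-Z0-9=?./-]+");
        MatchInfo info;
        if (url.match (txt, 0, out info)) {
            do {
                int byte_start, byte_end;
                // fetch_pos () reports byte indexes into txt.
                info.fetch_pos (0, out byte_start, out byte_end);

                // Convert each byte index into a character offset by counting
                // the characters contained in the first n bytes of txt.
                int char_start = txt.char_count (byte_start);
                int char_end = txt.char_count (byte_end);

                print ("%s: bytes %d-%d, characters %d-%d\n",
                       info.fetch (0), byte_start, byte_end, char_start, char_end);
            } while (info.next ());
        }
    } catch (RegexError e) {
        warning ("%s", e.message);
    }
}
```

Because char_count() counts whole characters whatever their encoded size, the same conversion should also cope with box-drawing characters such as ╭ ┬ ┼ ├, which take 3 bytes each in UTF-8.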
system
(system)
Closed
March 13, 2025, 10:04am
9
This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.