Evolution .pst import: hangs

Hi All

I am trying to import .pst files (from Outlook). The import stops after a few mails (2 to 25) and the window freezes (I can exit neither the import screen nor the application without killing it from the terminal).

  • No messages, just a progress bar that stops.
  • The few imported mails can be selected and read.
  • .pst files are 400-500 MB

Am I missing something obvious?

Any help appreciated for this returning user - thanks in advance.

Hi,
what is your Evolution version, please?

Grab a backtrace of the evolution process in the frozen state, which
may show where precisely it has got stuck. You can get the backtrace
with a command like this:

   $ gdb --batch --ex "t a a bt" --pid=`pidof evolution` &>bt.txt

Please check the bt.txt for any private information, like passwords,
email addresses, server addresses,… I usually search for “pass” at
least (quotes for clarity only), before sharing it anywhere. Ideally,
have the debuginfo packages installed for at least glib2, gtk3,
libsoup3, evolution-data-server and evolution, so that the backtrace is
usable. The debuginfo package versions should precisely match the
binary package versions.
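
The manual check of bt.txt described above can be partly automated. Here is a minimal Python sketch, assuming the backtrace was saved as plain text; the two patterns and the name `find_private_lines` are illustrative assumptions, and the result is only a hint list, not a guarantee that the file is safe to share.

```python
import re

# Illustrative patterns only (an assumption of this sketch, not a complete
# privacy check): anything containing "pass", and anything email-shaped.
SUSPICIOUS = {
    "password-like": re.compile(r"pass", re.IGNORECASE),
    "email address": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def find_private_lines(text):
    """Return (line_number, label, line) for every line matching a pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SUSPICIOUS.items():
            if pattern.search(line):
                hits.append((number, label, line.strip()))
    return hits
```

Reading bt.txt into a string and passing it to `find_private_lines` lists the lines worth a second look. gdb annotations such as `job=job@entry=0x...` are not reported as email addresses, because the pattern requires a dot in the domain part.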

Bye,
Milan

Hi Milan

thanks for the support!

evolution version: 3.52.3-0ubuntu1

Debuginfo packages: I installed

  • libglib2.0-bin-dbgsym
  • libsoup-3.0.0-dbgsym
  • evolution-data-server-dbgsym
  • evolution-dbgsym
  • all debug versions of installed gtk3 libraries (I wasn’t sure which ones are relevant here)

I did not find debug packages for all libs.

Update to symptoms: evolution itself is not frozen, just the import.

debug output (with your command): file is too big. Can you advise me on the proper way to share? The lines from the thread with the pst import are below.

Thanks!
Matt

Thread 3 (Thread 0x77b7274006c0 (LWP 17836) "pool-evolution"):
#0  0x000077b7868b7ba0 in camel_html_parser_tag@plt () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#1  0x000077b786977c91 in ?? () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#2  0x000077b7869083e4 in camel_mime_filter_filter () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#3  0x000077b7869578a6 in ?? () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#4  0x000077b786951716 in camel_stream_write () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#5  0x000077b7869521da in camel_stream_write_to_stream () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#6  0x000077b7868caa42 in ?? () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#7  0x000077b7868cae02 in camel_data_wrapper_write_to_stream_sync () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#8  0x000077b786912ff0 in ?? () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#9  0x000077b7868cae02 in camel_data_wrapper_write_to_stream_sync () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#10 0x000077b786921c87 in ?? () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#11 0x000077b7868cb272 in camel_data_wrapper_decode_to_stream_sync () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#12 0x000077b78691532b in ?? () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#13 0x000077b7868e1684 in ?? () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#14 0x000077b7868e1726 in ?? () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#15 0x000077b7868e133b in ?? () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#16 0x000077b7711cdd5d in camel_imapx_server_append_message_sync (is=0x77b7340015c0, mailbox=0x77b6400ee970, summary=0x77b6400eba30, message_cache=0x77b6400f2360, message=0x77b640115eb0, mi=0x77b64011bd80, appended_uid=0x77b7273ff3f0, cancellable=0x77b64011be00, error=0x77b7273ff3e8) at /usr/src/evolution-data-server-3.52.3-0ubuntu1/src/camel/providers/imapx/camel-imapx-server.c:5144
#17 0x000077b7711b6364 in imapx_conn_manager_append_message_run_sync (job=0x77b640122540, server=0x77b7340015c0, cancellable=0x77b64011be00, error=0x77b7273ff460) at /usr/src/evolution-data-server-3.52.3-0ubuntu1/src/camel/providers/imapx/camel-imapx-conn-manager.c:2324
#18 0x000077b7711c1c79 in camel_imapx_job_run_sync (job=0x77b640122540, server=0x77b7340015c0, cancellable=0x0, error=0x77b7273ff520) at /usr/src/evolution-data-server-3.52.3-0ubuntu1/src/camel/providers/imapx/camel-imapx-job.c:474
#19 0x000077b7711b3dc0 in camel_imapx_conn_manager_run_job_sync (conn_man=conn_man@entry=0x621b037fdb50, job=job@entry=0x77b640122540, finish_before_job=finish_before_job@entry=0x0, cancellable=cancellable@entry=0x0, error=error@entry=0x0) at /usr/src/evolution-data-server-3.52.3-0ubuntu1/src/camel/providers/imapx/camel-imapx-conn-manager.c:1263
#20 0x000077b7711b5d04 in camel_imapx_conn_manager_append_message_sync (conn_man=conn_man@entry=0x621b037fdb50, mailbox=mailbox@entry=0x77b6400ee970, summary=0x77b6400eba30, message_cache=message_cache@entry=0x77b6400f2360, message=message@entry=0x77b640115eb0, mi=mi@entry=0x77b64011bd80, append_uid=0x0, cancellable=0x0, error=0x0) at /usr/src/evolution-data-server-3.52.3-0ubuntu1/src/camel/providers/imapx/camel-imapx-conn-manager.c:2365
#21 0x000077b7711b8ef6 in imapx_append_message_sync (folder=0x77b6400eb800, message=0x77b640115eb0, info=0x77b64011bd80, appended_uid=0x0, cancellable=0x0, error=0x0) at /usr/src/evolution-data-server-3.52.3-0ubuntu1/src/camel/providers/imapx/camel-imapx-folder.c:504
#22 0x000077b7868ea912 in camel_folder_append_message_sync () from /lib/x86_64-linux-gnu/libcamel-1.2.so.64
#23 0x000077b75f3e7016 in pst_process_email (item=<optimized out>, m=0x621b04121860) at /usr/src/evolution-3.52.3-0ubuntu1/src/plugins/pst-import/pst-importer.c:1381
#24 pst_process_item (previous_folder=<synthetic pointer>, d_ptr=<optimized out>, m=0x621b04121860) at /usr/src/evolution-3.52.3-0ubuntu1/src/plugins/pst-import/pst-importer.c:915
#25 pst_import_folders (topitem=0x77b64000a740, m=0x621b04121860) at /usr/src/evolution-3.52.3-0ubuntu1/src/plugins/pst-import/pst-importer.c:821
#26 pst_import_file (m=0x621b04121860) at /usr/src/evolution-3.52.3-0ubuntu1/src/plugins/pst-import/pst-importer.c:790
#27 pst_import_import (m=0x621b04121860, cancellable=<optimized out>, error=<optimized out>) at /usr/src/evolution-3.52.3-0ubuntu1/src/plugins/pst-import/pst-importer.c:701
#28 0x000077b778065294 in ?? () from /usr/lib/evolution/libemail-engine.so.0
#29 0x000077b7867bd542 in g_thread_pool_thread_proxy (data=<optimized out>) at ../../../glib/gthreadpool.c:336
#30 0x000077b7867b7c82 in g_thread_proxy (data=0x77b76c003be0) at ../../../glib/gthread.c:835
#31 0x000077b78049ca94 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#32 0x000077b780529c3c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
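
When the full bt.txt is too large to post, the relevant thread can be cut out with a small script instead of by hand. This is a sketch in Python that assumes the usual "Thread N (...)" header lines that gdb prints for "t a a bt"; the function names are hypothetical.

```python
import re

# gdb starts each thread of a "thread apply all bt" dump with a header like:
#   Thread 3 (Thread 0x77b7274006c0 (LWP 17836) "pool-evolution"):
THREAD_HEADER = re.compile(r"^Thread (\d+) \(")


def split_threads(backtrace_text):
    """Map thread number -> list of lines (header included) of that thread."""
    threads = {}
    current = None
    for line in backtrace_text.splitlines():
        match = THREAD_HEADER.match(line)
        if match:
            current = int(match.group(1))
            threads[current] = []
        if current is not None:
            threads[current].append(line)
    return threads


def threads_mentioning(backtrace_text, needle):
    """Return the text of every thread with a frame that mentions `needle`."""
    selected = []
    for _number, lines in sorted(split_threads(backtrace_text).items()):
        if any(needle in line for line in lines[1:]):
            selected.append("\n".join(lines).rstrip())
    return selected
```

Calling `threads_mentioning(text, "pst_import")` on the contents of bt.txt should return only threads like the one shown above.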

Hi,
these are (almost) perfect and sufficient. It looks like the libcamel
debug symbols are missing, but this minimum is good enough for now.

The backtrace shows that you are importing into an IMAP folder, the
code is adding a message there, and while it’s converting the memory
structure into a data stream, it has got stuck. Maybe it’s not really
stuck doing nothing; I guess it busy-loops, that is, it’s chewing up
one core of your CPU.

Importing to a non-IMAP folder won’t help, it’s something in the
message triggering the problem, but maybe you can try it, just in case.
There has been no change in the code at the top of the thread
backtrace since your version, thus if the problem is solely in that
part of the code, then it is present in the latest code as well. You
have a good version too.

Is it possible the archive contains a very large HTML message, please?
I know of a weak point in the libcamel filtering code: very long lines
are not treated well (they use a lot of CPU and memory due to
reallocation of the buffers, because the code wants to read the whole
line first, but it reads the data in small chunks - something like
that).

I’m afraid it’s not possible to extract the HTML part where this one is
stuck, to verify what it is and eventually do some testing with it
here, unless some very deep debugging things are involved.

If you can check whether one core of the CPU has a high usage while
importing the file (indicating the code does something), then I’d
suggest importing to an On This Computer folder (create a new one for
it; use the local folder to not make the IMAP server wait), and then
letting it run for some time. There is no progress indicator about
total messages being imported, I think, thus it’s not that easy to see
whether the import advanced to the next message (which may or may not
suffer from a similar problem), but the processed messages should be
visible in the destination folder. The folder can be frozen for
updates during the import, in which case the imported messages will be
shown only after the import is finished (successfully or cancelled),
at the latest after Evolution is restarted.
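
One way to check whether a single thread is busy-looping is to compare per-thread CPU time between two moments. Below is a minimal, Linux-only sketch in Python that reads /proc/<pid>/task/<tid>/stat; the function names are hypothetical, and the field positions follow the proc(5) layout (utime and stime are fields 14 and 15).

```python
import os


def parse_task_stat(stat_line):
    """Return (thread_name, cpu_ticks) from one /proc/<pid>/task/<tid>/stat line.

    The name sits in parentheses and may contain spaces, so split around the
    last ')' before counting fields. utime and stime are fields 14 and 15 of
    the full line, which are indexes 11 and 12 of the part after the name.
    """
    head, _, tail = stat_line.rpartition(")")
    name = head.split("(", 1)[1]
    fields = tail.split()
    return name, int(fields[11]) + int(fields[12])


def thread_cpu_ticks(pid):
    """Map thread id -> (name, cpu_ticks) for every thread of `pid`."""
    result = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            path = os.path.join(task_dir, tid, "stat")
            with open(path, encoding="utf-8", errors="replace") as handle:
                result[tid] = parse_task_stat(handle.read())
        except OSError:
            continue  # the thread exited between listdir() and open()
    return result


def busiest_threads(before, after, top=3):
    """Rank threads by CPU ticks consumed between two snapshots."""
    deltas = []
    for tid, (name, ticks) in after.items():
        if tid in before:
            deltas.append((ticks - before[tid][1], tid, name))
    deltas.sort(reverse=True)
    return deltas[:top]
```

Taking two snapshots with `thread_cpu_ticks` a few seconds apart (using the pid from `pidof evolution`) and passing them to `busiest_threads` shows which thread is consuming CPU. A thread named like "pool-evolution" at the top of the ranking would be consistent with the busy loop guessed at above.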

I’m sorry I do not have anything better.

Bye,
Milan

Hi Milan

thanks a lot for the detailed answer. Even without a quick fix, I tested the ideas you mentioned:

  • 2-3 cores add up to 100% usage (I don’t know what that means in practice, but the curves when two are active are totally symmetric).
  • importing to a new local folder shows the same symptoms.

I do see the imported messages. As they are in chronological order, I’ll check what the offending ones are.

One question: would it be useful to use a different file format (mbox, e.g.)?

Finally, there is one interesting thing: I first tried to load all my messages in one big .pst file (2 GB). It got stuck at a certain message (after about 3/4 of all messages, verified repeatedly to check whether it’s always the same place).

I then split the messages in various files, hoping those without the offending message would get loaded. To no avail…

Anyhow, I was just wondering what you meant by “very large HTML message”. Do you have a size indication for this?

Best
Matt

> would it be useful to use a different file format (mbox, e.g.)?

I suppose you mean the format of the destination folder in
Evolution. If so, then no, the backtrace suggests that the pst-import
plugin successfully extracted the message content from the .pst file
and created an internal structure (object) with the data describing the
message content. The busy loop happens at the time of converting this
structure (object) into a data stream, which is passed to the server or
written to the disk. This part cannot be skipped within Evolution
itself.

If you mean to not use the .pst format, but export the messages from
Outlook in an eml/mbox format, then it might help (see below).

> Anyhow, I was just wondering what you meant by “very large HTML
> message”. Do you have a size indication for this?

Say hundreds of kilobytes or more.
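
To look for candidates of that size, a script can scan exported mail for big HTML parts. This sketch in Python assumes the messages are already available as an mbox file (for instance exported from Outlook, as discussed in this thread). The 200000-byte default is only my reading of "hundreds of kilobytes", and `large_html_parts` is a hypothetical name.

```python
import mailbox


def large_html_parts(mbox_path, min_bytes=200_000):
    """List (subject, message_id, size, longest_line) for big text/html parts.

    `size` is the decoded size of the HTML part in bytes and `longest_line`
    the length of its longest line, since very long lines are the suspected
    weak point.
    """
    findings = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            for part in message.walk():
                if part.get_content_type() != "text/html":
                    continue
                payload = part.get_payload(decode=True) or b""
                if len(payload) < min_bytes:
                    continue
                longest = max((len(l) for l in payload.splitlines()), default=0)
                findings.append((
                    message.get("Subject", ""),
                    message.get("Message-ID", ""),
                    len(payload),
                    longest,
                ))
    finally:
        box.close()
    return findings
```

Messages reported here are only suspects. Whether one of them is the message the import is stuck on still has to be confirmed, for example by comparing subjects.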

Do you know gdb a bit? It’s possible to know which exact message the
code is working on using gdb.

Start Evolution under gdb:

   gdb evolution --ex r

run the import and wait until it’s stuck/busy loop. Switch to the
terminal with the gdb and press Ctrl+C, which will open a gdb prompt.
Do:

   t a a bt

to find out which thread contains the
camel_imapx_server_append_message_sync and switch to it (say it’s
thread 3 like in your above backtrace):

   t 3

The backtrace also shows which frame the append_message_sync function
is at (the #16 above):

   f 16

then you can print information about the message with:

   p camel_mime_message_get_message_id (message)
   p camel_mime_message_get_subject (message)

then you can quit the gdb:

   q
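
The thread and frame numbers for the steps above can also be looked up by a script from a saved backtrace. A minimal Python sketch follows; the helper names are hypothetical, and the `(const char *)` cast is my addition, on the assumption that gdb may refuse to call a function whose return type it cannot see when the libcamel debug symbols are missing.

```python
import re

THREAD_HEADER = re.compile(r"^Thread (\d+) \(")
# Frame lines look like "#16 0x... in function (args)" or "#24 function (args)".
FRAME_LINE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\w+)")


def locate_frame(backtrace_text, function):
    """Return (thread_number, frame_number) of the first frame in `function`."""
    thread = None
    for line in backtrace_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            thread = int(header.group(1))
            continue
        frame = FRAME_LINE.match(line)
        if frame and thread is not None and frame.group(2) == function:
            return thread, int(frame.group(1))
    return None


def gdb_print_commands(thread, frame):
    """Build the commands to type at the gdb prompt for that frame."""
    return [
        f"thread {thread}",
        f"frame {frame}",
        "print (const char *) camel_mime_message_get_message_id (message)",
        "print (const char *) camel_mime_message_get_subject (message)",
    ]
```

With the backtrace posted earlier, `locate_frame(text, "camel_imapx_server_append_message_sync")` would give thread 3 and frame 16. When importing to a local folder the function name in the backtrace will differ, so the lookup has to use whichever frame has a `message` argument.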

Since you face the same problem with multiple messages (.pst files),
it can also be something in the way the pst-import plugin extracts the
data from the .pst file. I would try to save this particular message in
an mbox/eml format from Outlook and import it on its own. Maybe it’ll
work, though it’s likely that Outlook formats the internal data
differently than the pst-import plugin does.

What would help Evolution is a very simple .pst file with which I
could reproduce the problem. It would be enough for it to contain this
single message and nothing else (supposing that the size of the .pst
file, or the number of messages saved in it, plays no role in
reproducing the problem). It’s a private message, thus nothing to be
shared here, in public. I can give you my email address, if you are
willing to share it with me, for testing purposes only, in case you are
able to reproduce the problem with such a simplified .pst file.

Another option is to avoid Evolution’s .pst import and use some
other tool to extract the messages from the .pst file (the .pst file
can contain more than just messages) and import those extracted files.
Maybe that is what you meant by the other format; then yes, I agree,
that could work, or at least it is worth a try.
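
If an external tool produces an mbox file (readpst from libpst is one example, though it is not named in this thread), splitting it into one file per message makes it easier to import messages one by one and to isolate the one that triggers the problem. This is a sketch in Python using only the standard library; `split_mbox_to_eml` is a hypothetical helper name.

```python
import mailbox
import os


def split_mbox_to_eml(mbox_path, out_dir):
    """Write each message of an mbox file to its own numbered .eml file.

    The raw bytes stored in the mbox are copied without the leading
    "From " separator line, so the message content is not re-encoded.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for index, key in enumerate(box.iterkeys(), start=1):
            target = os.path.join(out_dir, f"{index:05d}.eml")
            with open(target, "wb") as handle:
                handle.write(box.get_bytes(key))
            written.append(target)
    finally:
        box.close()
    return written
```

The resulting .eml files can then be imported individually, which also gives a small reproducer if one of them makes the import spin.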

Hi Milan

thanks again for the detailed support. I’ll try your steps and in particular try to separate out an offending message.

Best
Matt