Discussion:
Impossible crashes
Cees De Groot
2006-07-19 15:18:02 UTC
Hi,

We are getting reports from customers with stacktraces indicating all
sorts of impossibilities:

- nil elements in the ScheduledDelays collection (Delay class var);
- crashes in #becomeForward: that look so nonsensical it almost seems
as if two of these calls were active at the same time;
and other stuff like that.

I was wondering - there's no way, I hope, that the wx threads can be
messing with the VM, is there? That suddenly two native threads start
interpreting or something like that? Because that is almost the only
thing that can explain this (random memory corruption is hardly likely
to cause these crashes - they've each been reported multiple times).
Andreas Raab
2006-07-19 15:18:02 UTC
Very likely memory corruption is your problem (unless you use other FFI
or large C libraries). If you can run this single threaded (or put the
threads "under control" by the app) you should write-protect the Squeak
object-memory during those primitive calls. This will make the VM crash
at the point of failure instead of ten minutes later in a GC.

Cheers,
- Andreas
Cees De Groot
2006-07-19 15:18:02 UTC
Post by Andreas Raab
[...] write-protect the Squeak
object-memory during those primitive calls.
Is there an easy way to do that? I'm not really fluent in either the
VM or Win32, so any pointer is welcome...

I've already considered recompiling the wx libraries and wxsqueak
with bounds checking, but I haven't found any free libraries for that
(nor could I get any conclusive info on what MSVC includes). And it
wouldn't catch every kind of pointer mess-up, of course.
Rob Gayvert
2006-07-19 15:18:02 UTC
That sounds like a great idea. It looks like this can be accomplished in
the Windows VM using a couple of simple calls to VirtualProtect(). Is
that right? If so, I could slip this in as a debugging option.

.. Rob
Andreas Raab
2006-07-19 15:18:02 UTC
Post by Rob Gayvert
That sounds like a great idea. It looks like this can be accomplished in
the Windows VM using a couple of simple calls to VirtualProtect(). Is
that right? If so, I could slip this in as a debugging option.
Correct. You also need an exception handler that logs the relevant
information and proceeds (!), because otherwise some legitimate accesses
might be blocked (sockets, files, etc.). And you need to check
sqWin32Memory.c to make sure there is no conflict in the access modes.
(When I did it, I changed it right alongside that code, which might be an
option if you have a tailored VM.)

Cheers,
- Andreas
