Re: Crash Heap in NTDLL when server running
No. I can't give solutions to problems I don't know the answer to... I can only suggest where to look for solutions.
If *you* don't understand, that's not my problem. If Zahara and 35hit don't understand me, I'm sure they will ask me to clarify a section they don't know how to follow.
Using auto restart is a *very* small bandage on a *big* wound. If (like 35hit's) the server doesn't quit, and just stops letting players log on until someone kills it and starts it up again, auto restart is useless... and if you leave a crashing server crashing and keep developing it anyway, your server will get much worse *very fast*.
So from my point of view, "Use an autostarter" is an admin answer, it's not a developer answer. You can't develop a server that is already crashing regularly.
Re: Crash Heap in NTDLL when server running
Do you have a better solution than an auto restarter? If yes, just state it clearly.
I know that an auto restarter will only work on a server.exe that closes (Zahara mentioned that his server.exe closes, and that's why I offered him this solution until we find a fix for the crashes).
I can give you an example: FroggPT. They have crashes almost every day (and use an auto restarter to put the server back up), and they have been up since 2006.
Re: Crash Heap in NTDLL when server running
Yes... I do.
Clearly? Fix the problem in the server.exe code. o.O Simple enough?
Re: Crash Heap in NTDLL when server running
All the information you give is simply not important for fixing the problem. You can ask the people here whether all this information you write (and you are the only one who knows what you are talking about) is helpful or not.
Re: Crash Heap in NTDLL when server running
Quote: Originally Posted by Vovozozo
@Bobsobol
Anyway if the server.exe exits, you can add a auto restarter to auto restart it everytime it crashes.
If you use it, the server will roll back items, exp... I want to fix it, not use auto restart ^^. I think Sandurr knows how to fix it. :))
Re: Crash Heap in NTDLL when server running
It's okay... for some (other, personal) reason I fell for that bait. XD
No disrespect to Frogg, I know their server and team are good, and I'm sure they do use some Auto Restart, but if their server crashes on a daily basis, I would suggest it is a sign of their popularity, and it is shutting down in an orderly manner in order to prevent those who would otherwise take advantage of it.
I know Zahara has been running E-XPT for at least 2 or 3 years, and I'm reasonably sure he will have something in place to re-initialise a failed service. But what he is describing is not an orderly shutdown. Rather, it is quite a serious crash. If the poor kernel is suffering regularly with this kind of abuse, it is likely to leave the system unstable, and an automatic Server Reboot would be more in order than simply starting the Server executable up again.
Having said that, 35Hit could probably implement an automated server shutdown and then restart every 12 hours or so, as a temporary fix until he figures out where his virtual memory leak (or whatever it turns out to be) is. Then at least players would know 'when' they are going to DC. XD (@35hit, call it "regular maintenance hour" to your players. :wink: Maybe every 12 hours on the hour: 12:00 & 24:00 hrs server time.)
Back to the main thread issue. I do not see this issue on most servers, so, Zahara, has your server always done this, or is it a recent event?
I don't think it's inherent to all PT Servers, and I certainly don't think it's present in the original jPT server... but swapping out server executables for a while would prove that.
Having said that, I know that the original jPT server has clearly been modified to take sql.dll, Clan.dll & URSLogin.dll... these aren't loaded by normal C++ means, so it could be as old as that.
If it's not, finding out which group of server executables do it and which don't... how widespread the issue is... would be a great help. I had assumed it was unique to the current E-XPT server, but if you think Sandurr would have any knowledge of that, you must think the problem is more widespread than just something you've added yourself. (I guess?)
Re: Crash Heap in NTDLL when server running
It's a normal crash for other PT servers, so we have Auto Restart ^^. If a PT server is running as an official server, it won't have auto restart. I think if you turn on DEP, it only terminates the error message box ^^
Re: Crash Heap in NTDLL when server running
That's what I'm doing: restarting the server every 12 hours.
Re: Crash Heap in NTDLL when server running
HeapAlloc, HeapCreate, HeapFree...? Is that the problem? Or is it an overflow?
Re: Crash Heap in NTDLL when server running
Well... I always run with UAC and DEP fully enabled on my servers, if they fail either then they are not running securely and need fixing IMHO.
Complying with DEP is not difficult, and very wise... specifically because not complying can run you into problems like this. Remember what it stands for: "Data Execution Prevention". If you clearly define to the OS what is data, and what is code (which normally linked EXEs do anyway with the .text and .data sections) then the OS will ensure that you don't try to execute instructions in your data memory, and don't try to write data over your code memory. A pretty sound policy to have enforced, don't you think?
The problem is early teams like Global Fantasy adding a single section to the executable which they declare as Readable, Writeable, Executable, Pre-initialised and uninitialised data. :s It's just not safe to have such an insecure memory section in an executable. You can put the code and static, unchanging data in one section and only dynamic and state based data in another, or you can separate them all out into code, static (read only) data and changing data... but don't put them all in one memory block. XD You can also add no new sections, but instead rewrite the import directory for the executable, add everything that is in your GFantasy section into a .DLL (in appropriate .text, .data, .rdata sections, "GFantasy.dll" maybe) and access functions via the normal DLL methods.
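To make the point about section flags concrete, here is a rough Python sketch that checks a section's characteristics value for the write + execute combination. The flag constants are the standard PE/COFF ones; the "GFantasy" value is only my assumption of what a section declared that way would carry, not something read from a real executable.

```python
# Section characteristic flags from the PE/COFF format.
IMAGE_SCN_CNT_CODE = 0x00000020
IMAGE_SCN_CNT_INITIALIZED_DATA = 0x00000040
IMAGE_SCN_CNT_UNINITIALIZED_DATA = 0x00000080
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def is_write_and_execute(characteristics):
    """True when a section is both writable and executable."""
    wx = IMAGE_SCN_MEM_WRITE | IMAGE_SCN_MEM_EXECUTE
    return (characteristics & wx) == wx


# Typical linker output versus a hypothetical catch-all section.
sections = {
    ".text": 0x60000020,     # code, execute, read
    ".data": 0xC0000040,     # initialised data, read, write
    # Assumed flags: read, write, execute, initialised + uninitialised data.
    "GFantasy": 0xE00000C0,
}
for name, flags in sections.items():
    verdict = "writable AND executable" if is_write_and_execute(flags) else "ok"
    print("%-8s %08x %s" % (name, flags, verdict))
```

Any section that comes out as writable and executable is the kind of memory block described above; splitting it up so no single section carries both flags is the goal.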
I'm fairly sure Sandurr wouldn't have a GFantasy section, as I know he was involved in their developments, and can therefore re-create all of that code on his own, and in a much more system friendly manner.
Heap operations are performed by the Server Executable, and by the OS itself when a new thread is created or destroyed.
The solution I would propose is that you find an exe which doesn't have this problem, and find out which change between it and your server executable causes it... then you know the code which is written badly.
Short of that, you are in an awful sticky mess of trying to match up pairs of such statements, or others that might lead to them.
I've said this would almost never happen in high level programming, so debugging it is going to be very tricky... but: it may be possible to spot such anomalies by globally patching and logging any and all such API calls. There are some tools to monitor Windows APIs generally, but IDK if they would give you enough information.
Blade API Monitor - Trace and log API and ActiveX interface with parameters
API Monitor: Spy on API Calls and COM Interfaces (Freeware 32-bit and 64-bit Versions!) :: rohitab.com
API Monitor - Spy and display Win32 API calls made by applications
Dev Stuff
The last is probably the most comprehensive free solution, and comes with an SDK that would allow you (if you are skilled enough) to write a custom solution to this debugging problem. (It kinda "did my head in" though, so be warned.)
What you would need to know, is each Allocation API called, from where, why, and on what handle, or address.
Specifically:
- You need to match up allocations of a specific block with their de-allocation. That's why you need the handle or memory block address.
- You need to know where (in running memory) each of these calls is made, so you can find the code in Olly.
- You need to know where the last call was made to it, because of wrappers and because many of the allocations are made by higher level APIs, which will require more in-depth debugging.
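As a rough idea of the matching step, here is a minimal Python sketch. It assumes your logger can emit one record per call with the operation, the block address and the caller's return address; that record layout is hypothetical (none of the tools above necessarily output it), and the addresses are made up for illustration.

```python
from collections import namedtuple

# Hypothetical log record: op is "alloc" or "free", addr is the block address,
# caller is the return address of the code that made the call.
HeapEvent = namedtuple("HeapEvent", "op addr caller")


def match_heap_events(events):
    """Pair each allocation with its free; report leaks and bad frees."""
    live = {}        # addr -> caller that allocated the block
    bad_frees = []   # frees of blocks that are not currently allocated
    for ev in events:
        if ev.op == "alloc":
            live[ev.addr] = ev.caller
        elif ev.op == "free":
            if ev.addr in live:
                del live[ev.addr]
            else:
                bad_frees.append(ev)
    return live, bad_frees


# Made-up events: one block is never freed, another is freed twice.
events = [
    HeapEvent("alloc", 0x00A10000, 0x00401000),
    HeapEvent("alloc", 0x00A11000, 0x00401020),
    HeapEvent("free", 0x00A10000, 0x00401040),
    HeapEvent("free", 0x00A10000, 0x00401040),
]
leaks, bad_frees = match_heap_events(events)
for addr, caller in leaks.items():
    print("never freed: %08x allocated from %08x" % (addr, caller))
for ev in bad_frees:
    print("bad free:    %08x called from %08x" % (ev.addr, ev.caller))
```

The caller addresses in the report are what you would then look up in Olly.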
It's a nasty issue, if it's not due to a code change you made. o.O
Re: Crash Heap in NTDLL when server running
I can reproduce the Heap in NTDLL bug. I researched it yesterday ^^. But I haven't fixed it yet. I'm still trying.
Re: Crash Heap in NTDLL when server running
This is *deep* debugging. Short of "Auto Restart", or re-producing every patch since jPT 4096 without making the same mistake, I don't expect there will be any easy answer.
I trust Zahara will keep at it though. :wink:
Re: Crash Heap in NTDLL when server running
Code:
FAULTING_IP:
ntdll!RtlRestoreLastWin32Error+235
7c82a38b 8b39 mov edi,dword ptr [ecx]
EXCEPTION_RECORD: ffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 7c82a38b (ntdll!RtlRestoreLastWin32Error+0x00000235)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 00000000
Parameter[1]: 00000000
Attempt to read from address 00000000
PROCESS_NAME: Server.exe
FAULTING_MODULE: 7c800000 ntdll
DEBUG_FLR_IMAGE_TIMESTAMP: 4361c0a8
ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".
EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".
EXCEPTION_PARAMETER1: 00000000
EXCEPTION_PARAMETER2: 00000000
READ_ADDRESS: 00000000
FOLLOWUP_IP:
EXPTServer!InitD3D+1b1a71
005c7171 e8f1440000 call Server!InitD3D+0x1b5f67 (005cb667)
FAULTING_THREAD: 00001a40
BUGCHECK_STR: APPLICATION_FAULT_NULL_POINTER_READ_WRONG_SYMBOLS
PRIMARY_PROBLEM_CLASS: NULL_POINTER_READ
DEFAULT_BUCKET_ID: NULL_POINTER_READ
LAST_CONTROL_TRANSFER: from 005c7171 to 7c82a38b
STACK_TEXT:
WARNING: Stack unwind information not available. Following frames may be wrong.
0013d6f4 005c7171 096f0000 00000000 0001b35c ntdll!RtlRestoreLastWin32Error+0x235
0013d730 005c7193 0001b35c 005c71be 0001b35c Server!InitD3D+0x1b1a71
0013d738 005c71be 0001b35c 00000000 005c6f5d Server!InitD3D+0x1b1a93
0013d744 005c6f5d 0001b35c 00000021 00570b3e Server!InitD3D+0x1b1abe
00000000 00000000 00000000 00000000 00000000 Server!InitD3D+0x1b185d
Follow the STACK_TEXT.
What does BUGCHECK_STR: APPLICATION_FAULT_NULL_POINTER_READ_WRONG_SYMBOLS mean???
I searched on Google: http://www.google.com.vn/#hl=vi&sour...3df6141400ac8a
Re: Crash Heap in NTDLL when server running
Google's response is your own question. http://www.google.co.uk/url?sa=t&sou...OHBXkg&cad=rja
lol
Following STACK_TEXT: it warns that stack unwind information is not available and that the frames listed may be wrong. However, InitD3D does exist in the server, and should never be called... most "servers" don't possess the correct hardware (or virtual hardware on VMs or VPS) to "play" the game... and it should only run the D3D code if it is in Client mode, not Server mode.
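If you want the return addresses out of STACK_TEXT to look up in a debugger, a few lines of Python will do it. This is only a sketch, and it assumes the usual 32-bit WinDbg column layout (ChildEBP, RetAddr, three arguments, then the symbol); the sample lines are the first three frames from the dump posted above.

```python
import re

# One frame per line: ChildEBP, RetAddr, three stack arguments, then the symbol.
FRAME = re.compile(
    r"^([0-9a-f]{8}) ([0-9a-f]{8}) ([0-9a-f]{8}) ([0-9a-f]{8}) ([0-9a-f]{8}) (\S+)$"
)


def parse_frames(stack_text):
    """Return (child_ebp, return_address, symbol) for each frame line."""
    frames = []
    for line in stack_text.splitlines():
        m = FRAME.match(line.strip())
        if m:
            child_ebp = int(m.group(1), 16)
            ret_addr = int(m.group(2), 16)
            frames.append((child_ebp, ret_addr, m.group(6)))
    return frames


stack_text = """\
0013d6f4 005c7171 096f0000 00000000 0001b35c ntdll!RtlRestoreLastWin32Error+0x235
0013d730 005c7193 0001b35c 005c71be 0001b35c Server!InitD3D+0x1b1a71
0013d738 005c71be 0001b35c 00000000 005c6f5d Server!InitD3D+0x1b1a93
"""
for ebp, ret, sym in parse_frames(stack_text):
    print("%08x returns to %08x  (%s)" % (ebp, ret, sym))
```

Bear in mind the warning in the dump: without unwind information these frames may be wrong, so treat the addresses as leads rather than facts.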
Code:
mov edi,dword ptr [ecx]
And what was EDI and ECX at the time, and what was at the location pointed to by ECX?
It would appear that ECX was == 0 (the dump says "Attempt to read from address 00000000"), in which case it's pointing into the null page at the very bottom of the address space, and reading that is not, and never has been, allowed in User Mode code.
What was the return address at the point ntdll!RtlRestoreLastWin32Error was called (the fault is at offset 0x235 from that symbol), what was passed to it, and from where? I can guarantee it will be a routine in a system library, not called directly from Server.exe, since that imports no functions from NTDLL.dll.
So what does? Well, most DLLs lead back there eventually.
- Kernel32.dll
- User32.dll
- GDI32.dll
- AdvAPI32.dll
- Shell32.dll
- OLE32.dll
- IMM32.dll
- WSock32.dll
all lead directly to NTDLL.dll, but also... MSVCRT.dll leads to NTDLL.dll, as do many of the other low level helper libraries... so DSound.dll, DDraw.dll, ShlWAPI.dll and ODBC32.DLL will lead back to a dependency on it pretty quick too.
This type of error seems to be commonly associated with memory fragmentation in .NET and Java applications.
I've suffered no adverse effect from removing the dependencies to DSound and DDraw, and then going through removing any routine which utilised calls to them, and any data which is only used by those routines I've removed. Rather like the way I get rid of all XTrap related code in Clients (there's a guide up here somewhere). The exception is that you can't use the server as a client any more. (Boo hoo... Naht!)
I did wonder if DDraw was used in any way by the server to handle the 3D representation of the maps in memory... but it seems not to be.
--- EDIT ---
As a side-note, this particular error would never happen on my current working server, as NTDLL.dll in my OS does not export a function RtlRestoreLastWin32Error. So I know you are using a newer OS than Windows 2000. XD
I don't know what error would occur on my OS, but I know it wouldn't look like that. If *my* NTDLL.dll exported that function, I could look back at what system libraries implemented by the server might call it, and could then set breakpoints, or redirect calls to those routines to a debug log and execute routine so that I would know (when the server crashes) which routine was last called, by looking at the log my code produced. (the last entry, of course)
Multiple sample logs may show that several different root routines cause this error, depending on which one of them is called at the point where resources have become unavailable, or too fragmented... however, these routines should have some common feature, all allocating blocks of memory, or setting privileges or creating thread or such. This will tell you which resource is in contention, and might prompt closer monitoring of that resource, and possibly implementing garbage collection, defragmentation, error checking and clean exception code that would minimise the effect (possibly reducing player count, de-allocating monsters or maps which are not presently in use or just denying new logins till the "state of emergency" is resolved, rather than dumping the entire server process.)
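Comparing the logs can be as simple as counting which routine was the last entry in each one. A minimal Python sketch, assuming one debug log per crash with one routine name per line (the log contents below are invented for illustration):

```python
from collections import Counter


def last_calls(crash_logs):
    """Count which routine was the final log entry in each crash log."""
    tally = Counter()
    for lines in crash_logs:
        entries = [ln.strip() for ln in lines if ln.strip()]
        if entries:
            tally[entries[-1]] += 1
    return tally


# Hypothetical logs, one list of lines per crash; names are illustrative.
logs = [
    ["CreateThread", "HeapAlloc", "HeapFree", "HeapAlloc"],
    ["HeapAlloc", "HeapFree", "CreateThread"],
    ["HeapFree", "HeapAlloc"],
]
for routine, count in last_calls(logs).most_common():
    print(routine, count)
```

If one routine, or one family of routines, dominates the tally, that points at the resource worth monitoring more closely.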
Alternatively, it may be that some *new* routine (something in GFantasy or such) is creating more fragmentation than is necessary, and adapting it to use a memory management routine (with built in garbage collection, such as those in MSVCRT) in its place would alleviate the situation altogether.
I'm still just speculating. There is much that needs to be discovered about the root cause of this error. And as *I* have never seen it, I can't look for you. (sorry)
--- EDIT ---
I shared a video link for a Microsoft Developer (Arun Kishan) who worked on this system in Vista talking about how it operates. I've also found an old backup I made of a Microsoft Systems Journal from 1997 where Matt Pietrek discusses the implementation of SEH, both in the Visual C compiler and in the OS itself.
It's hard-core, and I'm not sure I understand it all, but you may find something useful in it. >link<