How does the external merge sort algorithm work?

Results 1 to 2 of 2
  1. #1
    Newbie betty12 is offline
    MemberRank
    Jun 2022 Join Date
    9Posts

    How does the external merge sort algorithm work?

    I'm attempting to figure out how the external merge sort algorithm works (I noticed other solutions to the same issue, but didn't find what I was looking for). I'm reading Jeffrey McConnell's book "Analysis Of Algorithms" and attempting to build the method mentioned in it.
    For example, I have the following input data: 3,5,1,2,4,6,9,8,7, and I can only load four digits into memory.
    My initial step is to read the input file in four-digit chunks, sort them in memory, and then write one to file A and the next to file B.
    I got:
    Code:
    [COLOR=var(--black-800)]A:[1,2,3,5][7]  
    [/COLOR][COLOR=var(--black-800)]B:[4,6,8,9][/COLOR]
    Now my issue is, how can I merge chunks from these files into larger ones if they won't fit in memory? According to Jeffrey McConnell, I need to read half portions and merge them into files C and D.
    But I got the sequencing wrong:
    Code:
    [COLOR=var(--black-800)]C:[1,2,4,6,3,8,5,9]
    [/COLOR][COLOR=var(--black-800)]D:[7][/COLOR]
    PS: I understand how to merge numbers by reading from a file, so I watched this video, which was really informative, but I'm not sure how to accomplish it with in-memory buffers to decrease I/O operations.
    Could someone kindly offer an example with step-by-step instructions?


  2. #2
    Ginger by design. jMerliN is offline
    Grand MasterRank
    Feb 2007 Join Date
    2,500Posts
    You need to make a working merge sort on regular lists first, the whole thing in memory. That's includes the list merging algorithm.

    You just need to make the list merging algorithm operate on open files instead of lists in memory. You only ever care about the next number in each file, so for the two files you're trying to merge, you only need 2 numbers loaded into memory. Whichever one is smaller gets written to the merged output file instead of being appended to a result list like normal.



Advertisement