corrupt exports with Ubuntu Server 12.04.1 (64 bit)

  • I'm attempting to move my ConQuest install from an old Windows server over to a newer box with Ubuntu Server 12.04.1 (64 bit). After getting it all set up and moving my configs over from the Windows box, I immediately ran into an issue. The ConQuest server on the Ubuntu box is able to receive images, but fails to export them the majority of the time (the destinations abruptly drop connections mid-transfer). Images that do successfully export end up being corrupted on the receiving end. I figured I goofed up something with the configs, so I did a fresh build of 1.4.16k, left the vanilla configs intact and added one simple export converter. Same problem. I tried various types of transfer syntaxes and image sets. Same problem (even the sample image data fails). It doesn't matter if you initiate the transfer via an export converter, a DICOM query or a manual push from the ConQuest server. The end result is always the same: failed transfers or corrupted pixel data. I should note here that ConQuest is storing inbound images correctly. The corruption only occurs during an outbound transfer.


    So I did packet captures with Wireshark. Everything in the header looks correct until it gets to the pixel stream. That's when things go wonky. Usually the destination notices that the data is corrupt and drops the connection (or in some cases outright crashes).


    Thinking that I may have a hardware or network issue, I did a fresh install of 12.04.1 in a vm (vanilla install with nothing else but build-essential, apache2 and unzip). Same problem. So then I tried Ubuntu Server 10.10 (64 bit) in another vm and it worked! Out of curiosity, I tarred up the build of ConQuest from the 10.10 vm and moved it over to the original 12.04.1 box. It seems to work fine.


    So has anyone else run into this issue? Is it safe to use the build from the 10.10 vm on the 12.04.1 box?

  • Hi,


    Can you try to downgrade and see where the problem disappears? With ftp you can browse the download folder for older versions. conquestlinux1415.tar.gz with dgatepatch1415c.zip may be a good candidate to try; after that version the compression/decompression changed drastically.


    Marcel

  • Hi,


    I run on Ubuntu 10.04 without problems. And you are saying that if you compile the same package on a 10.10 vm, the dgate binary runs without issues on 12.04? I guess that indicates a compiler/library issue.


    Marcel

  • Marcel,


    Hi. Sorry. I corrected a spelling mistake in the original post and didn't realize it would get sent back into the approval queue. ;)


    Quote from marcelvanherk

    I run on Ubuntu 10.04 without problems. And you are saying that if you compile the same package on a 10.10 vm, the dgate binary runs without issues on 12.04? I guess that indicates a compiler/library issue.


    Exactly. If I take the compiled package from the 10.10 vm and move it over to the 12.04 server, outbound transfers from dgate work as expected. I'm just wondering if it's a good idea to use a binary compiled for one target on another. I'm still a bit ignorant when it comes to Linux.


    Quote from marcelvanherk

    Can you try to downgrade and see where the problem disappears? With ftp you can browse the download folder for older versions. conquestlinux1415.tar.gz with dgatepatch1415c.zip may be a good candidate to try; after that version the compression/decompression changed drastically.


    I couldn't get 1.4.15c to compile under Ubuntu Server 12.04.1 (64 bit). At first it was stopping with this error:

    Code
    npipe.cpp: At global scope:
    npipe.cpp:13:17: error: ‘http’ does not name a type
    cp: cannot stat `dgate': No such file or directory


    I took a look at npipe.cpp from dgatepatch1415c.zip and realized that line 13 was missing its '//'. After adding that in, now it stops with this error:

    Code
    /tmp/cc1ZZ0X2.o: In function `PrefetchPatientData(char*, int)':
    total.cpp:(.text+0x67e37): undefined reference to `MakeSafeString(char*, char*)'
    /tmp/cc1ZZ0X2.o: In function `prefetcherthread(conquest_queue*)':
    total.cpp:(.text+0x69952): undefined reference to `MakeSafeString(char*, char*)'
    collect2: ld returned 1 exit status
    cp: cannot stat `dgate': No such file or directory


    1.4.15 does compile, but it has the same transfer issue as 1.4.16k:

    Code
    UPACS THREAD 1: STARTED AT: Fri Sep 14 12:04:55 2012
    A-ASSOCIATE-RQ Packet Dump
    Calling Application Title : "EFILM "
    Called Application Title : "CONQUESTSRV1 "
    Application Context : "1.2.840.10008.3.1.1.1", PDU length: 16384
    Number of Proposed Presentation Contexts: 1
    Presentation Context 0 "1.2.840.10008.5.1.4.1.2.2.2" 1
    Server Command := 0021
    Message ID := 0005
    C-Move Destination: "EFILM "
    (QualifyOn) (mapped) IP:xxx.xxx.xxx.xxx, PORT:4006
    MyStudyRootRetrieveGeneric :: SearchOn
    Query On Image
    Issue Query on Columns: DICOMImages.SOPClassUI, DICOMImages.SOPInstanc, DICOMSeries.SeriesInst, DICOMStudies.StudyDate, DICOMStudies.StudyTime, DICOMStudies.AccessionN, DICOMStudies.ReferPhysi, DICOMStudies.StudyDescr, DICOMStudies.PatientNam, DICOMStudies.PatientID, DICOMStudies.PatientBir, DICOMStudies.PatientSex, DICOMStudies.StudyInsta, DICOMStudies.StudyID,DICOMImages.ObjectFile,DICOMImages.DeviceName
    Values: DICOMStudies.StudyInsta = '1.3.46.670589.5.2.10.2156913941.892665384.993397' and DICOMSeries.StudyInsta = DICOMStudies.StudyInsta and DICOMImages.SeriesInst = DICOMSeries.SeriesInst
    Tables: DICOMImages, DICOMSeries, DICOMStudies
    Records = 2
    Number of Images to send: 2
    MyStudyRootRetrieveGeneric :: RetrieveOn
    Locating file:MAG0 samples/0001_002000_892665661.v2
    Locating file:MAG0 samples/0001_003000_892665662.v2
    Sending file : ./data/samples/0001_002000_892665661.v2
    Image Loaded from Read Ahead Thread, returning TRUE
    Retrieve: remote connection dropped after 0 images, 2 not sent
    C-Move (StudyRoot)
    UPACS THREAD 1: ENDED AT: Fri Sep 14 12:09:56 2012
    UPACS THREAD 1: TOTAL RUNNING TIME: 301 SECONDS


    Here's the error from the eFilm side:

    Code
    (3580) 09-14 16:09:15.88 ERROR: ServerChild caught CException
    (3580) 09-14 16:09:15.88 Out of memory.


    Here's the result of another test with 1.4.15 using an export converter to a Candelis image router:

    Code
    ExportConverter0.0: forward ./data/samples/0001_003000_892665662.v2 to TEST
    *** ExportConverter0.0: Forward failed to send DICOM image to TEST
    ExportConverter0.0: forward ./data/samples/0001_002000_892665661.v2 to TEST
    *** ExportConverter0.0: Forward failed to send DICOM image to TEST


    Here's the error on the Candelis side:

    Code
    ERROR 2012-09-14 16:34:19.498 ig.imagemgr C-STORE SCP failed, received corrupt DICOM file [/dicom/v4/TEST/CT_505394cb7bf237c9.dcm]: Invalid Stream.
    ERROR 2012-09-14 16:34:19.499 ig.imagemgr C-STORE SCP failed for instance [1.3.46.670589.5.2.10.2156913941.892665340.475317] with status [0xc000].
    ERROR 2012-09-14 16:34:19.612 ig.imagemgr C-STORE SCP failed, received corrupt DICOM file [/dicom/v4/TEST/CT_505394cb5fe66687.dcm]: Invalid Stream.
    ERROR 2012-09-14 16:34:19.613 ig.imagemgr C-STORE SCP failed for instance [1.3.46.670589.5.2.10.2156913941.892665339.718742] with status [0xc000].


    I also compiled 1.4.14 under 12.04 and it has the outbound transfer issue as well.


    1.4.15 & 1.4.15c both compiled under Ubuntu Server 10.10 (64 bit) and work without issues.


    I'd like to run 12.04, but it's not a big deal if I need to roll the server back to 10.10. I'm just curious to know if anyone else has had this issue or knows of a solution.


    By the way, I've been using ConQuest for a number of years. It's always met our needs and I'm continually impressed by the flexibility offered on the scripting side. I just want to say thank you for all of your hard work and continued support!

  • Hi,


    The issue is new, and apparently sits in the library (which has hardly changed over releases) when compiled on the new Ubuntu. I don't think there is an issue with cross-compilation, but as usual I would like to get to the bottom of the problem. Thanks for the extensive testing. And I really appreciate your thanks as well!


    I guess I will have to upgrade my Linux test system to the newest Ubuntu, but I don't think this will happen for the upcoming 1.4.17 release. Bruce Barton is working on a refactoring of the code specifically targeted at Unix-like systems and we anticipate that his code base will form the base of the 1.4.18 releases. At that time we will make sure that Ubuntu 12 will be part of the test suite.


    But can you in the meantime post a partial Wireshark hex dump showing where the transfer fails, maybe comparing it to a dump where the sending succeeds, to give a better clue where to locate the problem? My guesstimate is that it may have to do with multi-character constants used in the original parts of the library.


    Regards,


    Marcel

  • Marcel,


    I'm not sure if this is exactly what you need, but here's a zip with reassembled hex streams from Wireshark (using the ConQuest sample data): https://www.dropbox.com/s/qbwv…iled_with_ubuntu12041.zip. I tried to attach it to this post, but couldn't for some reason.


    Here's a screenshot of where the streams diverge: http://i.imgur.com/ifMVZ.png


    Initially I thought it was just the pixel data at the end that was getting corrupted, but the streams actually diverge at around 4096 bytes (regardless of where the pixel data actually starts). I confirmed this with another data set as well.


    Even more interesting is that there's a pattern that occurred with both data sets that I tested:


    Code
    0000 - 0fff same
    1000 - 656f diff
    6f60 - 7fff same
    8000 - fe0f diff
    fe10 - ffff same
    ...


    If that's not what you need, let me know!
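
    For anyone who wants to reproduce this kind of same/diff table from two captured streams, here is a minimal sketch in Python. It assumes both streams are already available as raw bytes; the function names `diff_ranges` and `format_runs` and the 16-byte row size are illustrative choices, not part of ConQuest or Wireshark.

```python
def diff_ranges(a: bytes, b: bytes, row: int = 16):
    """Compare two byte streams row by row and merge neighbouring rows
    with the same verdict into (start, end, label) runs.

    Only the common prefix (the shorter length) is compared.
    """
    n = min(len(a), len(b))
    runs = []
    for off in range(0, n, row):
        end = min(off + row, n) - 1
        label = "same" if a[off:end + 1] == b[off:end + 1] else "diff"
        if runs and runs[-1][2] == label:
            # Extend the previous run instead of starting a new one.
            runs[-1] = (runs[-1][0], end, label)
        else:
            runs.append((off, end, label))
    return runs


def format_runs(runs):
    """Render runs in the 'start - end label' style used in this thread."""
    return [f"{start:04x} - {end:04x} {label}" for start, end, label in runs]


if __name__ == "__main__":
    # Synthetic demo data (not taken from the real captures).
    good = bytes(range(256)) * 2
    bad = bytearray(good)
    bad[0x100:0x180] = b"\xff" * 0x80
    for line in format_runs(diff_ranges(good, bytes(bad))):
        print(line)
```

    If the streams were saved as hex text, `bytes.fromhex(text)` should convert them to bytes before calling `diff_ranges`. Comparing whole rows rather than single bytes avoids splitting a corrupted region wherever a byte happens to match by chance.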

  • Hi,


    Your posts and uploads are very helpful, but I am a bit lost as to where to look for this problem, maybe in buffer.cxx, which uses an outgoing buffer size of 32600, but 6f60 is 28512...
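
    The mismatch between the offsets and the buffer size can be checked with a few lines of Python. This is only arithmetic on numbers quoted in this thread (the 32600-byte outgoing buffer and the hex offsets from the range table); the helper name `describe` is made up for illustration and does not correspond to anything in the library.

```python
BUFFER_SIZE = 32600  # outgoing buffer size in buffer.cxx, as quoted in this thread


def describe(offset: int, buffer_size: int = BUFFER_SIZE):
    """Summarise how a stream offset relates to 4096 and to the buffer size."""
    return {
        "hex": f"{offset:04x}",
        "decimal": offset,
        "mod_4096": offset % 4096,
        "buffer_size_minus_offset": buffer_size - offset,
    }


if __name__ == "__main__":
    # Boundaries taken from the same/diff table posted earlier in the thread.
    for off in (0x1000, 0x6f60, 0x8000, 0xfe10):
        print(describe(off))
```

    The first boundary, 0x1000, is exactly 4096, while 0x6f60 is neither a multiple of 4096 nor equal to 32600, which is why the buffer size alone does not explain the pattern.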


    Could you maybe try to change DEFAULT_BREAK_SIZE in buffer.cxx to e.g. 8192 and see if the problem moves to another stream location?


    Marcel

  • Marcel,


    I think you're on to something. Changing DEFAULT_BREAK_SIZE to 8192 resolves the problem (100% identical streams)!


    Out of curiosity, I also tried 16384, which shifted the problem a bit and resulted in the following:

    Code
    0000 - 0fff same
    1000 - 2fff diff
    3000 - 4fff same
    5000 - 6fff diff
    7000 - 8fff same
    9000 - afff diff
    ...

  • Marcel,


    I played around with BlockSize in pdata.cxx, but it only shifted the problem around. It never really resolved it.


    BlockSize=8192

    Code
    0000 - 1fff same
    2000 - 5fff diff
    6000 - 7fff same
    8000 - fe0f diff


    BlockSize=16384

    Code
    0000 - 7fff same
    8000 - fe0f diff
    fe1f - 17fff same
    18000 - 1fb6f diff


    BlockSize=32768

    Code
    0000 - 17fff same
    18000 - 1fb6f diff
    1fb70 - 300ff same


    BlockSize=65536

    Code
    0000 - 300ff same
    30100 - eof diff


    BlockSize=131072

    Code
    0000 - 1fffff same
    20000 - 27a1f diff
    27a20 - eof same


    BlockSize=262144 mostly worked for the sample data (pixel data was intact, but the start of the stream was slightly offset, which made doing a diff difficult), but a 40MB multi-frame ultrasound tripped it up. I suspect the sample data only worked because it all fit into a single block.
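
    One way to look for structure in the tables above is to check which run boundaries fall on a multiple of the BlockSize in use. The sketch below does that in Python. The offsets are copied from the tables as posted (including fe1f); the dictionary name `observed` and the helper `aligned` are hypothetical, and whether alignment actually matters for the bug is an open question.

```python
# Start offsets of each run after the first, per BlockSize, as posted above.
observed = {
    8192: [0x2000, 0x6000, 0x8000],
    16384: [0x8000, 0xFE1F, 0x18000],
    32768: [0x18000, 0x1FB70],
    65536: [0x30100],
    131072: [0x20000, 0x27A20],
}


def aligned(boundaries, block_size):
    """Map each boundary (as a hex string) to whether it is a multiple of block_size."""
    return {hex(b): b % block_size == 0 for b in boundaries}


if __name__ == "__main__":
    for block_size, boundaries in observed.items():
        print(block_size, aligned(boundaries, block_size))
```

    In this data, most of the offsets where a diff region begins are multiples of the BlockSize, while the offsets where the streams agree again usually are not. That would fit a problem at the hand-off between blocks, but it is only a pattern in a handful of samples.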

  • Hi,


    OK, so the issue involves both constants, i.e., the interaction between pdata and buffer goes wrong. As I said, the code is ancient and it is also cryptic. Can you therefore enable the print statements (put in there by the Davis people) in pdata.cxx and buffer.cxx? Comparing these between both builds might give some more clues on the error. I would suggest going back to the original settings for further tests.


    Thanks,


    Marcel

  • Marcel,


    The existing print statements in pdata.cxx and buffer.cxx have not been yielding any interesting results. The chunks being read into and out of the buffer end up being the same sizes. I did see a difference being attributed to the print statement on line 583 in buffer.cxx, but I believe it's due to Length being initialized and not being assigned a value until the while loop that occurs below the print statement. Perhaps they meant it to be ILength? Also the print statement on line 588 of buffer.cxx causes the compile to error out. ABS.Size doesn't look to be a valid reference. Perhaps they meant it to be ABS.Buffersize? But that didn't work either.


    I'll keep digging, adding in some new print statements to see if I can figure out where it diverges.

  • Hi,


    Thanks a lot. I just received another report of Ubuntu Server 12.04.1 (64 bit) failing, both in the MySQL interface (unexpected) and in image transfer to a WADO web viewer (which uses a DICOM transfer, so this is expected). It may be an issue in an Ubuntu library...


    Marcel
