Multi-threading

  • Hi,


If you are willing to help, I can send you a test version of dgate 1.5.0beta with extended logging for the ImportConverter forwarding issue.


As far as multithreading goes, Conquest is able to fill most of a 1 Gb ethernet connection on a single thread (reading and sending run on separate threads) for export converters. What network speed do you have, and can you see how much of it is used when doing a delayed forward on an export converter? The main reason to use multiple threads is to increase network occupation or to avoid delays when there is large network latency.


On my 1 Gb office network, a single-threaded send runs at 600 Mbps if I ask the receiving server to destroy (not save) the image.


I may be able to add worker threads to the delayed forward relatively easily, to allow multiple moves to be initiated at once. Each forward would run on a single thread, but multiple forwards would run simultaneously.


    Marcel

I would be more than happy to help. But I need to get a solution in place, and right now I am working with MIRTH to handle the DICOM side (3 separate installs, 3 separate servers) and another MIRTH for just HL7 messages. I don't know whether it will work properly or be fast enough, but I guess I will see.


    Here is a simple description:


Datacenter with 23 servers. Of those, 11 are on separate fiber connections for internet: PACS, SQL database, web sites, routers, etc. Every one is also connected to the internal network at 1 Gb, and everything is on CAT6. All my fiber connections are 900 Mbps up/down (at least; we usually get better speeds). The actual fiber hub for the entire pond is also in my datacenter. A few years ago (we have been open for 10 years now) they decided that, since I was bringing in so many lines over the years, they would move off the pole and terminate right in the datacenter for easy access and future connections and upgrades.


Our current router software for incoming studies runs on 3 servers, all of them Windows 2008R2, all of them on separate fiber lines at the speeds stated above. The data is processed by this router software and then pushed to the PACS. The PACS uses 6 different network cards: 3 of them for the incoming studies from the router software, 2 for outbound studies, and the 6th for internet access.


The current router software is very old and comes from a company called Clario. It is based on ClearCanvas; it extracts the DICOM header information needed to populate a worklist (through SQL) from each study, does data coercion if needed, and adds accession numbers to any incoming study that does not have one. It then pushes the studies across the internal network to the PACS.


    The issues with it are:


1. It is a 32-bit program and only allows 2 DICOM connections at a time

    2. It does not understand any compression above JPEG Lossless

3. No support or upgrades in 3 years, and it is now totally unsupported


Also, I am replacing all the Windows 2008R2 servers with either Windows 10 Pro (where I don't need a server OS) or Windows 2016.


So, I need a solution that can handle at least 800 incoming studies per day (MRI, CT, MG, US, DR, NM, etc.) AND can route out to at least 3 destinations at the same time.


That is why multi-threading is so important. My outbound is great: I have anywhere from 20 to 40 threads running at a time, spread over 3 network cards. Those threads send multiple studies at the study level, using JPEG LS compression.


I would really like inbound to be the same. Since I don't have any way to extract the DICOM header data and send it via HL7 to SQL except MIRTH, I am attempting to set up 3 MIRTH instances on 3 separate servers (mirroring the current router software setup) for DICOM. All of them will route to another MIRTH on another internal-only server, which does the HL7 extraction to the SQL server and also routes all DICOM studies to the PACS. I have to have multi-threaded DICOM connections and sends to the PACS. One study at a time from each MIRTH is too slow, and sending image by image will clog up the network and slow it down horribly. I actually tested all the outbound traffic at the image level and it was a disaster; I quickly had to switch back to study level.


Right now I can get a 500-slice CT or a 200-slice MRI to a radiologist in under 1 minute, and that is going to 3 destinations at once, so all 3 have the study in under 1 minute. US and DR arrive in seconds. I need that kind of speed internally to the PACS.


All professional router software out there runs about $15,000 per instance, plus around $2,500 per instance per year for service contracts.


My solution was to use Conquest for inbound DICOM, do the data coercion as needed, add accession numbers if needed, forward to the MIRTH server for the HL7 extraction and forwarding to the SQL server (and then dump the DICOM images), and at the same time route the DICOM studies to the PACS over the internal network to

  • Ok,


By the sound of it, it may be good to explore two strands:


    1) fix the ImportConverter issue with ForwardAssociationLevel

2) add worker threads to the delayed ExportConverter


These changes will be in 1.5.0beta. Would you mind trying this version already? Just use dgate64.exe from the 1.5.0beta download; no other changes are required.


    Marcel

  • Hi,


1) will log, each time it happens, the reason why a non-delayed forward association is closed


    2) will work as follows:


    DelayedForwarderThreads = 5

    ExportConverters = 1

    ExportConverter0 = forward study to OFFICE after 10


Each incoming association will trigger a forward that is single-threaded. But if up to 5 associations send data at once or close together, up to 5 forwarders can be triggered, and their sends will overlap. This is by far the simplest change and it could do the job.


I will do a bit more testing and then release it on GitHub.


    Marcel

  • Hi,


you can fetch the new 1.5.0beta1 dgate64.exe from GitHub (link on the server web page) with both of the above changes.


Also added the clause 'split' to experiment with more multithreading; it causes each converter to send only part of the objects.


I.e., this will use one outgoing connection per incoming connection (but open up to 5 in parallel):


    DelayedForwarderThreads = 5

    ExportConverters = 1

    ExportConverter0 = forward study to PARADIGIT after 10

This will use two outgoing connections per incoming connection (each up to 5 in parallel):


    DelayedForwarderThreads = 5

    ExportConverters = 2

    ExportConverter0 = forward study to PARADIGIT split 0/2 after 10

    ExportConverter1 = forward study to PARADIGIT split 1/2 after 10


    regards


    Marcel

I did the config as you said and it did work, after I made a few mistakes, that is! I will continue testing, but 20 DR studies went to Conquest multi-threaded and were sent out the same way at the study level.


A question: I tried adding the "org %u" clause at the end of the lines and it locked up Conquest and killed the server. Is it still possible to use the "send as received AET" option, or do I have it wrong or in the wrong place?

  • Hi,


Great to hear that. Which of the 2 possibilities did you try? If option 1 works it is preferable; option 2 has more overhead.


    Clause "org" does not exist on the delayed converters. This is an error in the manual.


It is possible, but a bit of a change to add it...


    Marcel

Understood about "send as". Right now, until I can figure out how to do data coercion and also how to add accession numbers when none are there, that option is going to be needed.


I did them both, but the second one I was very pleased with: 20 DR studies from the originating PACS to the RSA PACS in under 2 minutes.


My next step is to use both and put them under a much larger load (including CT, MR, MG, etc., more sites sending, and a higher study count) and see how it performs, but I am very encouraged right now.

  • Ok,


Data coercion is done through the "script" option placed at the end of each forward statement. The script is then run on the data of each outgoing image. If you can show in pseudocode what you want to do, I can tell you how to do it.
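For instance, a one-line coercion could look like this (a minimal sketch using the "script lua" form shown further below; the destination PARADIGIT and the SITE1_ prefix are just placeholders):


    ExportConverter0 = forward study to PARADIGIT after 10 script lua "Data.PatientID = 'SITE1_'..Data.PatientID"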


I hope multithreading works well. Especially the DelayedForwarderThreads option (where multiple threads serve a single queue) has to be tested thoroughly for random crashes; this is the first time I have used my (ancient) queue code this way. But with your data volume I assume we will detect any problems quickly.


The overhead of "split" comes from both forward commands doing a full query of the images to send, and then each deleting either the even or the odd images from the result. This is not a very pretty implementation, but it is the one with minimal changes and thus the most reliable.


Let me know how urgent the "org" option is; the code required to implement it is quite messy.


    regards


    Marcel

You are the best. I will try to work with the non-split one and put a real load on it, then do the same with the split one and monitor closely. If the results are close, I will use the non-split one.


If I can get the data coercion AND the insertion of accession numbers (where DICOM tag 0008,0050 is empty), then I don't need the "org" option.


I have been preparing a spreadsheet with all the data coercion rules I will need, because if I can't do it in Conquest (and at this time I have no idea, since I don't know anything about scripting) I will need to do it in the PACS. If that would be helpful, once I complete it and check to make sure I am not missing anything, I can upload it if this system takes Excel files. BUT, if I have to do it in the PACS, then I will have to be able to do a "send as".


If I have ONE example of how to add a prefix to an MRN and a suffix to an accession number (only if the accession is included with the study; if it is blank, Conquest should insert a random accession number at least 8 digits long and do NO other data coercion), and also how to have it insert an accession number if needed, I can do the rest. I work best when I see how something works (an example), as then I understand it, or at least have fewer questions.

  • Ok,


If you have a lot of rules, you may best do something like this. First add this to dicom.ini (the [lua] section always goes last):


    [lua]

    association = dofile('lua/coercionrules.lua')


    This will load all functions and data in file lua/coercionrules.lua into memory for each new association.


The ExportConverter could then look like, e.g.:


    ExportConverter0 = forward study to PARADIGIT after 10 script lua "coerce_rule1('%u')"


and coercionrules.lua would be plain Lua (in the lua subfolder on the server) and could define coerce_rule1 and many other things.
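For illustration, here is a sketch of what such a file might contain (hypothetical, not the actual posted file; it implements the rules requested above, with SITE1_ and _RT as placeholder prefix and suffix):


    Code
    -- lua/coercionrules.lua: per-image coercion rules, loaded once per association.
    -- 'org' receives the calling AE title passed in as %u by the converter.
    function coerce_rule1(org)
      local acc = Data.AccessionNumber
      if acc == nil or acc == '' then
        -- No accession number in the study: insert a random 8-digit one, no other coercion
        Data.AccessionNumber = string.format('%08d', math.random(0, 99999999))
        return
      end
      -- Example coercions: prefix the MRN and suffix the accession number;
      -- rules could also branch on the sender, e.g. if org == 'SENDER1' then ... end
      Data.PatientID = 'SITE1_' .. Data.PatientID
      Data.AccessionNumber = acc .. '_RT'
    end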

Hope this gives you some ideas. This method has the least overhead for complex operations.


To test, download ZeroBrane Studio and use install.lua. You can then run scripts independently and develop with code completion.


    E.g. write test.lua:


    dofile('lua/coercionrules.lua')

    Data = DicomObject:new()

    Data:load('c:\\temp\\sample.dcm')

    coerce_rule1('SENDER1')


    Marcel

To make a regular DICOM move multithreaded, this script can be used (it requires an update after 1.5.0beta1 to be able to read Command.MoveDestination from a script):


    Code
    [lua]
    RetrieveConverter0 = if Data["9999,0b00"] then return end; Data["9999,0b00"]="0/2"; local s={}; s[1]='local dst="'..Command.MoveDestination..'"'; s[2]='local ddo='..Data:Serialize()..';local ddo2=DicomObject:new();for k,v in pairs(ddo) do ddo2[k]=ddo[k] end'; s[3]='ddo2["9999,0b00"]="1/2"'; s[4]='print("start move to "..dst)' s[5]='dicommove(Global.MyACRNema, dst, ddo2)'; servercommand('luastart:'..table.concat(s, ' '));

It modifies the original C-MOVE to move half the slices (0/2), then constructs, in table s, the code to start the move of the other half of the slices (1/2), and then starts that on another thread with servercommand('luastart:'..). To avoid infinite recursion it checks the split tag "9999,0b00" before starting. Not for the faint of heart.


    Marcel

  • Hi,


GitHub was just updated to support the above in-line script. It would work as well for the delayed forward, as that also initiates a move.


It may be more readable to put the routine in a separate Lua file. Here it is, just tested with 3-30 threads (only use many threads on high-latency long-distance links; each thread has quite some overhead):
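(The attached file itself is not reproduced in the thread; below is a hedged reconstruction based on the in-line RetrieveConverter0 above, generalized to n parallel moves. The file name splitmove.lua and function name splitmove are assumptions.)


    Code
    -- lua/splitmove.lua: reconstruction of the split-move routine,
    -- generalized from the in-line RetrieveConverter0 above to n parallel moves.
    -- Call from a converter, e.g.:
    --   RetrieveConverter0 = dofile('lua/splitmove.lua') splitmove(4)
    function splitmove(n)
      if Data['9999,0b00'] then return end    -- already a split sub-move: avoid recursion
      Data['9999,0b00'] = '0/'..n             -- this thread moves slice group 0 of n
      for i = 1, n - 1 do
        -- build the code that moves slice group i of n, and run it on a new thread
        local s = {}
        s[1] = 'local dst="'..Command.MoveDestination..'"'
        s[2] = 'local ddo='..Data:Serialize()..';local ddo2=DicomObject:new();'..
               'for k,v in pairs(ddo) do ddo2[k]=ddo[k] end'
        s[3] = 'ddo2["9999,0b00"]="'..i..'/'..n..'"'
        s[4] = 'dicommove(Global.MyACRNema, dst, ddo2)'
        servercommand('luastart:'..table.concat(s, ' '))
      end
    end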

