processing large volumes #2027
Replies: 5 comments 20 replies
-
Updating with the image below.
-
Hi @paulobreim. Audio transcription is a heavy task, so this is expected. On #1909 @wladimirleite fixed a bottleneck you had helped to find. If your CPU is at 100%, there is probably nothing we can do. I'm currently on vacation, so I won't be able to help analyze your case's processing for now.
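If transcription time becomes prohibitive for a case, it can also be turned off in the processing profile; a minimal sketch, assuming the property name used by recent IPED versions:

```
# IPEDConfig.txt (processing profile) - property name assumed, check your IPED version
# Disabling transcription trades searchable audio content for much shorter processing time
enableAudioTranscription = false
```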
-
Well, as this file has already been processing for more than 48 hours and CPU usage is very low, I decided to interrupt and restart the process with --continue. In a while we will see if it recovers with high CPU usage.
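For reference, resuming is just a matter of re-running the same command against the same output folder with --continue; a minimal sketch with placeholder paths (the jar name depends on the IPED version):

```bat
REM Resume an interrupted IPED run, reusing the existing case output folder (placeholder paths)
java -jar iped.jar -d E:\evidence\image01.dd -o E:\cases\case2027 --continue
```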
-
Finally the last file was processed, although it did not finish normally. There were 5 or 6 reprocessings, but in one of them the log was empty because Windows simply rebooted the machine, and as only IPED was running, I think this reboot was caused by Java. Taking advantage of the end of the year, I would like to thank the entire community that helps maintain this project, especially @lfcnassif and @wladimirleite. You are truly outstanding.
-
Following your advice, I restarted all processing in groups of 5 images. I put everything in a .bat that ran for a few days (a rough sketch of the batch is shown after this list). Looking at the last run, it hit the famous OOME in one of the UFDR images, so I decided to investigate a little deeper, to gather information that allows a better analysis, and see how interesting it is:
1 - The command line that gave the error was:
2 - So I did a new indexing with just this image, using the following command line:
3 - So I decided to reprocess this file on top of the failed case, imagining that everything would go well, and used the following line:
4 - After 1:20 the situation was the same. A single task had accumulated 63 minutes of processing time, that is, practically only it was being executed, although from time to time other tasks appeared and finished quickly (IndexTask), which I believe is expected.
5 - After a few more minutes, the Task Manager showed that the process was no longer responding, although it still indicated CPU usage. And in fact, when I clicked on the IPED window it no longer responded.
6 - After a few more minutes, CPU usage dropped to zero and I noticed there were messages on the console. I decided to stop the processing because it was frozen.
7 - As it generated a dump, I uploaded it; it can be obtained from this link.
8 - Finally, here are the logs of the run that finished OK and the one that had the error. Note that they show different results. https://drive.google.com/file/d/1ZA4fVVqwcg8NlkeLgTlg69GzVleP5nzH/view?usp=sharing
In theory, if the isolated processing of this image was very fast and error-free, something must be causing the long processing time when the process uses --continue. I'm available for further testing.
paulo
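A minimal sketch of such a batch, assuming hypothetical image paths, case folders and jar location (the actual command lines referenced in items 1-3 above are not reproduced here):

```bat
@echo off
REM Hypothetical sketch: process the evidence in groups of 5 images, one case per group.
REM Paths, image names and the IPED jar location are placeholders.
set IPED=java -jar C:\iped\iped.jar

REM Group 1
%IPED% -d D:\images\img01.dd -d D:\images\img02.dd -d D:\images\img03.dd ^
       -d D:\images\img04.dd -d D:\images\img05.dd -o D:\cases\group01

REM Group 2, and so on for the remaining groups and the UFDR extractions
%IPED% -d D:\images\img06.dd -d D:\images\img07.dd -d D:\images\img08.dd ^
       -d D:\images\img09.dd -d D:\images\img10.dd -o D:\cases\group02
```

This assumes multiple -d data sources are accepted per case; if not, each image can simply get its own line and output folder.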
-
A few weeks ago I posted here about a case that has a lot of images to process.
After some problems that required restarting and continuing the processing, I gradually worked through different situations.
Now I have decided to start everything over again, and the screenshot below shows the current situation.
We are talking about 43 computer images, which are not actually raw .dd images, but rather results exported by IPED itself from dd processing done in previous versions, so they are not large volumes.
Added to this, we have emails received from Apple and Google, and finally 44 cell phone images in .ufdr format.
We haven't reached half of the processing yet, and what has a significant impact is precisely voice transcription.
We have 190 hours of processing so far, and there is still a long way to go.
I don't know how to assess whether that is too long or not, but I think it would be worth doing an analysis.
If necessary, I can give you (@lfcnassif @wladimirleite) access via AnyDesk so you can take a look.
I remember that in the first test, where I did several restarts, one thing that took a long time was the check for items possibly shared via WhatsApp. At the time it gave me the impression that it checked all images that had already been indexed, but I'm not sure about that.
tks
paulo