2020-04-30 20:53:40,286 - gimme.config - DEBUG - Using multiprocessing
2020-04-30 20:53:40,287 - gimme.config - DEBUG - Parameters:
2020-04-30 20:53:40,287 - gimme.config - DEBUG - fraction: 0.2
2020-04-30 20:53:40,287 - gimme.config - DEBUG - use_strand: False
2020-04-30 20:53:40,287 - gimme.config - DEBUG - abs_max: 1000
2020-04-30 20:53:40,287 - gimme.config - DEBUG - analysis: small
2020-04-30 20:53:40,287 - gimme.config - DEBUG - enrichment: 1.5
2020-04-30 20:53:40,287 - gimme.config - DEBUG - size: 200
2020-04-30 20:53:40,287 - gimme.config - DEBUG - lsize: 500
2020-04-30 20:53:40,287 - gimme.config - DEBUG - background: ['genomic']
2020-04-30 20:53:40,287 - gimme.config - DEBUG - cluster_threshold: 0.95
2020-04-30 20:53:40,287 - gimme.config - DEBUG - scan_cutoff: 0.9
2020-04-30 20:53:40,287 - gimme.config - DEBUG - available_tools: MDmodule,MEME,MEMEW,DREME,Weeder,GADEM,MotifSampler,Trawler,Improbizer,BioProspector,Posmo,ChIPMunk,AMD,HMS,Homer,XXmotif,ProSampler,DiNAMO
2020-04-30 20:53:40,287 - gimme.config - DEBUG - tools: Homer
2020-04-30 20:53:40,287 - gimme.config - DEBUG - pvalue: 0.001
2020-04-30 20:53:40,287 - gimme.config - DEBUG - max_time: -1
2020-04-30 20:53:40,287 - gimme.config - DEBUG - ncpus: 12
2020-04-30 20:53:40,287 - gimme.config - DEBUG - motif_db: gimme.vertebrate.v5.0.pfm
2020-04-30 20:53:40,287 - gimme.config - DEBUG - use_cache: False
2020-04-30 20:53:40,287 - gimme.config - DEBUG - custom_background: try-0/generated_background.genomic.fa
2020-04-30 20:53:40,287 - gimme.config - DEBUG - genome: sacCer3
2020-04-30 20:53:40,287 - gimme.config - DEBUG - No time limit for motif prediction
2020-04-30 20:53:40,290 - gimme.denovo - INFO - starting full motif analysis
2020-04-30 20:53:40,291 - gimme.denovo - DEBUG - Using temporary directory /tmp/gimmemotifs.25241.gh5nsl10
2020-04-30 20:53:40,292 - gimme.denovo - INFO - using size of 200, set size to 0 to use original region size
2020-04-30 20:53:40,292 - gimme.denovo - INFO - preparing input from BED
2020-04-30 20:53:40,294 - gimme.denovo - DEBUG - Splitting try-0/intermediate/input.bed into prediction set (try-0/intermediate/prediction.bed) and validation set (try-0/intermediate/validation.bed)
2020-04-30 20:53:40,352 - gimme.denovo - DEBUG - Creating genomic background
2020-04-30 20:53:40,458 - gimme.denovo - DEBUG - Creating genomic background
2020-04-30 20:53:40,840 - gimme.prediction - INFO - starting motif prediction (small)
2020-04-30 20:53:40,840 - gimme.prediction - INFO - tools: Homer
2020-04-30 20:53:40,986 - gimme.prediction - DEBUG - Skipping AMD
2020-04-30 20:53:40,986 - gimme.prediction - DEBUG - Skipping GADEM
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Skipping Improbizer
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Skipping JASPAR
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Skipping MEMEW
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Skipping ProSampler
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Skipping RPMCMC
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Skipping trawler
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Skipping Weeder
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Skipping XXmotif
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Skipping BioProspector
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Skipping ChIPMunk
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Skipping DiNAMO
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Skipping DREME
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Skipping HMS
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Starting Homer job, width 5
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Starting Homer job, width 6
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Starting Homer job, width 7
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Starting Homer job, width 8
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Skipping MDmodule
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Skipping MEME
2020-04-30 20:53:40,987 - gimme.prediction - DEBUG - Skipping MotifSampler
2020-04-30 20:53:40,988 - gimme.prediction - DEBUG - Skipping Posmo
2020-04-30 20:53:40,989 - gimme.prediction - DEBUG - Skipping YAMDA
2020-04-30 20:53:40,989 - gimme.prediction - INFO - all jobs submitted
2020-04-30 20:53:41,656 - gimme.prediction - INFO - Homer_width_5 finished, found 5 motifs
2020-04-30 20:53:41,657 - gimme.prediction - DEBUG - Starting stats job of 5 motifs
2020-04-30 20:53:41,657 - gimme.prediction - DEBUG - stdout : Running command: /home/zhegalova/anaconda3/envs/gimmemotifs/bin/homer2 denovo -i /home/galitsyna/atacseq/atacseq/results/bwa/mergedLibrary/macs/narrowPeak/try-0/intermediate/prediction.fa -b /home/galitsyna/atacseq/atacseq/results/bwa/mergedLibrary/macs/narrowPeak/try-0/intermediate/prediction.bg.fa -len 5 -S 5 -o /tmp/gimmemotifs.25241.gh5nsl10/Homer.q74p6ouo/homer_w5.auwilh4z -p 8
2020-04-30 20:53:41,657 - gimme.prediction - DEBUG - stdout : Number of Trial motifs (-T) set to 8 (from 10) to work well with 8 CPUs
Treating input files as FASTA format
Scanning input files...
Parsing sequences...
|0% 50% 100%|
===================================================================================
Total number of Oligos: 512
Autoadjustment for sequence coverage in background: 1.00x
Oligos: 512 of 544 max
Tree : 1044 of 2720 max
Optimizing memory usage...
Cache length = 11180
Using binomial scoring
Global Optimization Phase: Looking for enriched oligos with up to 2 mismatches...
Screening oligos 512 (allowing 0 mismatches):
|0% 50% 100%|
=====================================================================================
0.00% skipped, 100.00% checked (512 of 512), of those checked:
0.00% not in target, 0.00% increased p-value, 0.00% high p-value
Screening oligos 512 (allowing 1 mismatches):
|0% 50% 100%|
=====================================================================================
0.00% skipped, 100.00% checked (512 of 512), of those checked:
0.00% not in target, 66.80% increased p-value, 58.01% high p-value
Screening oligos 512 (allowing 2 mismatches):
|0% 50% 100%|
=====================================================================================
97.85% skipped, 2.15% checked (11 of 512), of those checked:
0.00% not in target, 2.15% increased p-value, 0.00% high p-value
Reading input files...
1672 total sequences read
Cache length = 11180
Using binomial scoring
Local Optimization Phase:
1 of 5 Initial Sequence: GCGCG... (-44.560)
Round 1: -87.51 GCGCG T:257.0(81.67%),B:551.0(30.41%),P:1e-38
Round 2: -87.51 GCGCG T:257.0(81.67%),B:551.0(30.41%),P:1e-38
=Final=: -45.08 GCGCG T:62.0(40.79%),B:174.0(11.45%),P:1e-19
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
2 of 5 Initial Sequence: GCCGC... (-29.305)
Round 1: -49.47 GCCGC T:246.0(80.28%),B:822.0(41.78%),P:1e-21
Round 2: -50.47 GCCCG T:221.0(76.75%),B:708.0(37.25%),P:1e-21
Round 3: -54.85 CCCGC T:255.0(81.42%),B:785.0(40.35%),P:1e-23
Round 4: -54.85 CCCGC T:255.0(81.42%),B:785.0(40.35%),P:1e-23
=Final=: -38.35 CCCGC T:68.0(44.74%),B:238.0(15.66%),P:1e-16
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
3 of 5 Initial Sequence: GTGTG... (-24.683)
Round 1: -30.73 GTGTG T:181.0(69.72%),B:747.0(38.84%),P:1e-13
Round 2: -30.73 GTGTG T:181.0(69.72%),B:747.0(38.84%),P:1e-13
=Final=: -12.21 GTGTG T:83.0(54.61%),B:557.0(36.64%),P:1e-5
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
4 of 5 Initial Sequence: CGCGG... (-21.200)
Round 1: -36.05 GGGGS T:122.0(55.30%),B:421.0(24.20%),P:1e-15
Round 2: -36.05 GGGGS T:122.0(55.30%),B:421.0(24.20%),P:1e-15
=Final=: -21.83 GGGGS T:107.0(70.39%),B:688.0(45.26%),P:1e-9
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
5 of 5 Initial Sequence: CTCCG... (-14.439)
Round 1: -19.44 CTCCG T:107.0(50.65%),B:490.0(27.56%),P:1e-8
Round 2: -22.08 GGCCG T:122.0(55.30%),B:556.0(30.64%),P:1e-9
Round 3: -22.08 GGCCG T:122.0(55.30%),B:556.0(30.64%),P:1e-9
=Final=: -15.29 GGCCG T:74.0(48.68%),B:440.0(28.95%),P:1e-6
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
Finalizing Enrichment Statistics (new in v3.4)
Reading input files...
1672 total sequences read
Cache length = 11180
Using binomial scoring
Checking enrichment of 5 motif(s)
|0% 50% 100%|
=================================================================================
Output in file: /tmp/gimmemotifs.25241.gh5nsl10/Homer.q74p6ouo/homer_w5.auwilh4z
Cleaning up temporary sequence and group files: 01684388389.seq 01684388389.group
2020-04-30 20:53:41,700 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:53:41,811 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:41,854 - gimme.prediction - INFO - Homer_width_6 finished, found 5 motifs
2020-04-30 20:53:41,854 - gimme.prediction - DEBUG - Starting stats job of 5 motifs
2020-04-30 20:53:41,854 - gimme.prediction - DEBUG - stdout : Running command: /home/zhegalova/anaconda3/envs/gimmemotifs/bin/homer2 denovo -i /home/galitsyna/atacseq/atacseq/results/bwa/mergedLibrary/macs/narrowPeak/try-0/intermediate/prediction.fa -b /home/galitsyna/atacseq/atacseq/results/bwa/mergedLibrary/macs/narrowPeak/try-0/intermediate/prediction.bg.fa -len 6 -S 5 -o /tmp/gimmemotifs.25241.gh5nsl10/Homer.5hh00q8h/homer_w6.9nodp5qt -p 8
2020-04-30 20:53:41,854 - gimme.prediction - DEBUG - stdout : Number of Trial motifs (-T) set to 8 (from 10) to work well with 8 CPUs
Treating input files as FASTA format
Scanning input files...
Parsing sequences...
|0% 50% 100%|
===================================================================================
Total number of Oligos: 2080
Autoadjustment for sequence coverage in background: 1.00x
Oligos: 2080 of 2161 max
Tree : 4188 of 10805 max
Optimizing memory usage...
Cache length = 11180
Using binomial scoring
Global Optimization Phase: Looking for enriched oligos with up to 2 mismatches...
Screening oligos 2080 (allowing 0 mismatches):
|0% 50% 100%|
===============================================================================
0.00% skipped, 100.00% checked (2080 of 2080), of those checked:
0.00% not in target, 0.00% increased p-value, 0.00% high p-value
Screening oligos 2080 (allowing 1 mismatches):
|0% 50% 100%|
===============================================================================
0.00% skipped, 100.00% checked (2080 of 2080), of those checked:
0.00% not in target, 45.29% increased p-value, 51.01% high p-value
Screening oligos 2080 (allowing 2 mismatches):
|0% 50% 100%|
===============================================================================
62.02% skipped, 37.98% checked (790 of 2080), of those checked:
0.00% not in target, 37.98% increased p-value, 0.00% high p-value
Reading input files...
1672 total sequences read
Cache length = 11180
Using binomial scoring
Local Optimization Phase:
1 of 5 Initial Sequence: GCGCGG... (-84.345)
Round 1: -102.63 GCGCSG T:237.0(79.08%),B:425.0(24.40%),P:1e-44
Round 2: -103.14 GSSGSG T:250.0(80.80%),B:445.0(25.39%),P:1e-44
Round 3: -105.68 GSCGCG T:185.0(70.51%),B:292.0(17.48%),P:1e-45
Round 4: -105.68 GSCGCG T:185.0(70.51%),B:292.0(17.48%),P:1e-45
=Final=: -55.96 GSCGCG T:61.0(40.13%),B:136.0(8.95%),P:1e-24
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
2 of 5 Initial Sequence: CGGCCC... (-36.918)
Round 1: -59.30 CGGCCC T:268.0(82.95%),B:798.0(40.85%),P:1e-25
Round 2: -59.30 CGGCCC T:268.0(82.95%),B:798.0(40.85%),P:1e-25
=Final=: -46.55 CGGCCC T:110.0(72.37%),B:531.0(34.93%),P:1e-20
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
3 of 5 Initial Sequence: TGCGCG... (-22.126)
Round 1: -30.34 TGCGCG T:182.0(69.92%),B:769.0(39.72%),P:1e-13
Round 2: -33.25 WGCGCG T:71.0(37.42%),B:196.0(12.10%),P:1e-14
Round 3: -33.25 WGCGCG T:71.0(37.42%),B:196.0(12.10%),P:1e-14
=Final=: -26.11 WGCGCG T:48.0(31.58%),B:165.0(10.86%),P:1e-11
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
4 of 5 Initial Sequence: TGTGTG... (-17.514)
Round 1: -34.73 TGTGTG T:94.0(46.23%),B:296.0(17.70%),P:1e-15
Round 2: -34.73 TGTGTG T:94.0(46.23%),B:296.0(17.70%),P:1e-15
=Final=: -9.88 TGTGTG T:65.0(42.76%),B:422.0(27.76%),P:1e-4
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
5 of 5 Initial Sequence: GGGTAA... (-15.977)
Round 1: -18.46 GGGTAA T:80.0(41.02%),B:348.0(20.47%),P:1e-8
Round 2: -18.46 GGGTAA T:80.0(41.02%),B:348.0(20.47%),P:1e-8
=Final=: -10.87 GGGTAA T:46.0(30.26%),B:251.0(16.51%),P:1e-4
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
Finalizing Enrichment Statistics (new in v3.4)
Reading input files...
1672 total sequences read
Cache length = 11180
Using binomial scoring
Checking enrichment of 5 motif(s)
|0% 50% 100%|
=================================================================================
Output in file: /tmp/gimmemotifs.25241.gh5nsl10/Homer.5hh00q8h/homer_w6.9nodp5qt
Cleaning up temporary sequence and group files: 01684388389.seq 01684388389.group
2020-04-30 20:53:41,866 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:41,888 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:53:41,947 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:42,115 - gimme.prediction - INFO - Homer_width_7 finished, found 5 motifs
2020-04-30 20:53:42,115 - gimme.prediction - DEBUG - Starting stats job of 5 motifs
2020-04-30 20:53:42,115 - gimme.prediction - DEBUG - stdout : Running command: /home/zhegalova/anaconda3/envs/gimmemotifs/bin/homer2 denovo -i /home/galitsyna/atacseq/atacseq/results/bwa/mergedLibrary/macs/narrowPeak/try-0/intermediate/prediction.fa -b /home/galitsyna/atacseq/atacseq/results/bwa/mergedLibrary/macs/narrowPeak/try-0/intermediate/prediction.bg.fa -len 7 -S 5 -o /tmp/gimmemotifs.25241.gh5nsl10/Homer.8sq4pbj3/homer_w7.403n_vu3 -p 8
2020-04-30 20:53:42,115 - gimme.prediction - DEBUG - stdout : Number of Trial motifs (-T) set to 8 (from 10) to work well with 8 CPUs
Treating input files as FASTA format
Scanning input files...
Parsing sequences...
|0% 50% 100%|
===================================================================================
Total number of Oligos: 8192
Autoadjustment for sequence coverage in background: 1.00x
Oligos: 8192 of 8628 max
Tree : 16692 of 43140 max
Optimizing memory usage...
Cache length = 11180
Using binomial scoring
Global Optimization Phase: Looking for enriched oligos with up to 2 mismatches...
Screening oligos 8192 (allowing 0 mismatches):
|0% 50% 100%|
================================================================================
6.36% skipped, 93.64% checked (7671 of 8192), of those checked:
6.36% not in target, 0.00% increased p-value, 0.00% high p-value
Screening oligos 8192 (allowing 1 mismatches):
|0% 50% 100%|
================================================================================
6.36% skipped, 93.64% checked (7671 of 8192), of those checked:
0.00% not in target, 35.36% increased p-value, 49.69% high p-value
Screening oligos 8192 (allowing 2 mismatches):
|0% 50% 100%|
================================================================================
62.11% skipped, 37.89% checked (3104 of 8192), of those checked:
0.00% not in target, 34.58% increased p-value, 0.00% high p-value
Reading input files...
1672 total sequences read
Cache length = 11180
Using binomial scoring
Local Optimization Phase:
1 of 5 Initial Sequence: GCGCGGC... (-70.278)
Round 1: -113.92 GCGCGSS T:287.0(84.96%),B:478.0(26.99%),P:1e-49
Round 2: -113.92 GCGCGSS T:287.0(84.96%),B:478.0(26.99%),P:1e-49
=Final=: -64.10 GCGCGSS T:73.0(48.03%),B:177.0(11.64%),P:1e-27
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
2 of 5 Initial Sequence: GGGGGGG... (-34.696)
Round 1: -52.69 GGGGGGG T:52.0(29.05%),B:70.0(4.50%),P:1e-22
Round 2: -54.79 SGGGGVG T:241.0(79.62%),B:750.0(38.96%),P:1e-23
Round 3: -54.79 SGGGGVG T:241.0(79.62%),B:750.0(38.96%),P:1e-23
=Final=: -37.29 SGGGGVG T:97.0(63.82%),B:469.0(30.86%),P:1e-16
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
3 of 5 Initial Sequence: GTGTGTG... (-28.850)
Round 1: -55.74 GTGTGTG T:72.0(37.83%),B:122.0(7.72%),P:1e-24
Round 2: -55.74 GTGTGTG T:72.0(37.83%),B:122.0(7.72%),P:1e-24
=Final=: -13.06 GTGTGTG T:31.0(20.39%),B:125.0(8.22%),P:1e-5
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
4 of 5 Initial Sequence: TGCCCGG... (-24.186)
Round 1: -37.75 TGCCCGG T:39.0(22.70%),B:59.0(3.81%),P:1e-16
Round 2: -38.05 AGCCCGS T:169.0(67.23%),B:621.0(33.55%),P:1e-16
Round 3: -41.89 TGCCCGG T:48.0(27.15%),B:79.0(5.07%),P:1e-18
Round 4: -41.89 TGCCCGG T:48.0(27.15%),B:79.0(5.07%),P:1e-18
=Final=: -31.99 TGCCCGG T:35.0(23.03%),B:74.0(4.87%),P:1e-13
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
5 of 5 Initial Sequence: GTAGCGG... (-16.931)
Round 1: -29.22 GTAGCGG T:160.0(65.22%),B:674.0(35.83%),P:1e-12
Round 2: -29.22 GTAGCGG T:160.0(65.22%),B:674.0(35.83%),P:1e-12
=Final=: -23.51 GTAGCGG T:82.0(53.95%),B:436.0(28.68%),P:1e-10
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
Finalizing Enrichment Statistics (new in v3.4)
Reading input files...
1672 total sequences read
Cache length = 11180
Using binomial scoring
Checking enrichment of 5 motif(s)
|0% 50% 100%|
=================================================================================
Output in file: /tmp/gimmemotifs.25241.gh5nsl10/Homer.8sq4pbj3/homer_w7.403n_vu3
Cleaning up temporary sequence and group files: 01684388389.seq 01684388389.group
2020-04-30 20:53:42,155 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:53:42,447 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:42,611 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:42,802 - gimme.prediction - INFO - Homer_width_8 finished, found 5 motifs
2020-04-30 20:53:42,802 - gimme.prediction - DEBUG - Starting stats job of 5 motifs
2020-04-30 20:53:42,802 - gimme.prediction - DEBUG - stdout : Running command: /home/zhegalova/anaconda3/envs/gimmemotifs/bin/homer2 denovo -i /home/galitsyna/atacseq/atacseq/results/bwa/mergedLibrary/macs/narrowPeak/try-0/intermediate/prediction.fa -b /home/galitsyna/atacseq/atacseq/results/bwa/mergedLibrary/macs/narrowPeak/try-0/intermediate/prediction.bg.fa -len 8 -S 5 -o /tmp/gimmemotifs.25241.gh5nsl10/Homer.xw9z3xwi/homer_w8.xxsxyym9 -p 8
2020-04-30 20:53:42,806 - gimme.prediction - DEBUG - stdout : Number of Trial motifs (-T) set to 8 (from 10) to work well with 8 CPUs
Treating input files as FASTA format
Scanning input files...
Parsing sequences...
|0% 50% 100%|
===================================================================================
Total number of Oligos: 32256
Autoadjustment for sequence coverage in background: 1.00x
Oligos: 32256 of 34497 max
Tree : 66064 of 172485 max
Optimizing memory usage...
Cache length = 11180
Using binomial scoring
Global Optimization Phase: Looking for enriched oligos with up to 2 mismatches...
Screening oligos 32256 (allowing 0 mismatches):
|0% 50% 100%|
================================================================================
45.17% skipped, 54.83% checked (17687 of 32256), of those checked:
45.17% not in target, 0.00% increased p-value, 0.00% high p-value
Screening oligos 32256 (allowing 1 mismatches):
|0% 50% 100%|
================================================================================
45.17% skipped, 54.83% checked (17687 of 32256), of those checked:
0.00% not in target, 30.02% increased p-value, 29.99% high p-value
Screening oligos 32256 (allowing 2 mismatches):
|0% 50% 100%|
================================================================================
81.64% skipped, 18.36% checked (5921 of 32256), of those checked:
0.00% not in target, 7.26% increased p-value, 0.00% high p-value
Reading input files...
1672 total sequences read
Cache length = 11180
Using binomial scoring
Local Optimization Phase:
1 of 5 Initial Sequence: GCCGCGCG... (-97.866)
Round 1: -115.70 GCCGCGCG T:288.0(85.06%),B:470.0(26.60%),P:1e-50
Round 2: -127.31 GCSGCGCG T:207.0(74.50%),B:272.0(16.39%),P:1e-55
Round 3: -131.11 GCSSSGCG T:199.0(73.11%),B:248.0(15.06%),P:1e-56
Round 4: -131.11 GCSSSGCG T:199.0(73.11%),B:248.0(15.06%),P:1e-56
=Final=: -63.67 GCSSSGCG T:70.0(46.05%),B:162.0(10.66%),P:1e-27
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
2 of 5 Initial Sequence: GGGAGGGG... (-38.962)
Round 1: -54.94 GGGAGGGG T:157.0(64.52%),B:438.0(25.04%),P:1e-23
Round 2: -54.94 GGGAGGGG T:157.0(64.52%),B:438.0(25.04%),P:1e-23
=Final=: -25.37 GGGAGGGG T:37.0(24.34%),B:104.0(6.84%),P:1e-11
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
3 of 5 Initial Sequence: CTCGCGCG... (-34.118)
Round 1: -35.47 CTCGCGCG T:115.0(53.19%),B:384.0(22.33%),P:1e-15
Round 2: -36.85 CTCGCGCG T:89.0(44.43%),B:259.0(15.67%),P:1e-16
Round 3: -36.85 CTCGCGCG T:89.0(44.43%),B:259.0(15.67%),P:1e-16
=Final=: -39.42 CTCGCGCG T:47.0(30.92%),B:111.0(7.30%),P:1e-17
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
4 of 5 Initial Sequence: TTACCCTG... (-23.831)
Round 1: -47.10 TTACCCTG T:29.0(17.42%),B:21.0(1.37%),P:1e-20
Round 2: -47.10 TTACCCTG T:29.0(17.42%),B:21.0(1.37%),P:1e-20
=Final=: -29.18 TTACCCTG T:12.0(7.89%),B:5.0(0.33%),P:1e-12
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
5 of 5 Initial Sequence: CGGAGCGG... (-22.065)
Round 1: -36.21 CGGAGCGG T:70.0(37.00%),B:182.0(11.29%),P:1e-15
Round 2: -38.24 CGGAGCGG T:118.0(54.11%),B:384.0(22.33%),P:1e-16
Round 3: -38.24 CGGAGCGG T:118.0(54.11%),B:384.0(22.33%),P:1e-16
=Final=: -31.02 CGGAGCGG T:74.0(48.68%),B:319.0(20.99%),P:1e-13
Performing exhaustive masking of motif...
Reprioritizing potential motifs...
Finalizing Enrichment Statistics (new in v3.4)
Reading input files...
1672 total sequences read
Cache length = 11180
Using binomial scoring
Checking enrichment of 5 motif(s)
|0% 50% 100%|
=================================================================================
Output in file: /tmp/gimmemotifs.25241.gh5nsl10/Homer.xw9z3xwi/homer_w8.xxsxyym9
Cleaning up temporary sequence and group files: 01684388389.seq 01684388389.group
2020-04-30 20:53:42,806 - gimme.prediction - DEBUG - waiting for statistics to finish
2020-04-30 20:53:42,826 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:53:42,837 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:43,213 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:43,503 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:43,872 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:44,296 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:44,717 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:45,484 - gimme.stats - DEBUG - chunk 1.0 of 1
2020-04-30 20:53:45,537 - gimme.scanner - DEBUG - scanning 611 sequences...
2020-04-30 20:53:45,658 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:53:45,659 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:45,710 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:45,810 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:46,123 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:46,325 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:46,587 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:47,054 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:47,414 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:47,859 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:48,462 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:48,990 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:49,875 - gimme.stats - DEBUG - chunk 1.0 of 1
2020-04-30 20:53:49,902 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:49,904 - gimme.scanner - DEBUG - scanning 611 sequences...
2020-04-30 20:53:49,915 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:53:49,961 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:50,077 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:50,424 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:50,658 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:50,964 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:51,436 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:51,846 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:52,370 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:52,867 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:53,456 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:54,359 - gimme.stats - DEBUG - chunk 1.0 of 1
2020-04-30 20:53:54,397 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:54,405 - gimme.scanner - DEBUG - scanning 611 sequences...
2020-04-30 20:53:54,418 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:53:54,462 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:54,588 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:54,927 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:55,180 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:55,507 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:56,002 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:56,449 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:56,956 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:57,678 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:58,320 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:53:59,264 - gimme.stats - DEBUG - chunk 1.0 of 1
2020-04-30 20:53:59,284 - gimme.scanner - DEBUG - scanning 611 sequences...
2020-04-30 20:53:59,306 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:53:59,455 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:53:59,475 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:53:59,494 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:53:59,537 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:53:59,573 - gimme.scanner - DEBUG - scanning 6110 sequences...
2020-04-30 20:53:59,585 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:53:59,625 - gimme.scanner - DEBUG - scanning 6110 sequences...
2020-04-30 20:53:59,639 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:53:59,640 - gimme.scanner - DEBUG - scanning 6110 sequences...
2020-04-30 20:53:59,653 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:53:59,655 - gimme.scanner - DEBUG - scanning 6110 sequences...
2020-04-30 20:53:59,666 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200 2020-04-30 20:53:59,961 - gimme.scanner - DEBUG - Scanning 2020-04-30 20:54:00,061 - gimme.scanner - DEBUG - Scanning 2020-04-30 20:54:00,082 - gimme.scanner - DEBUG - Scanning 2020-04-30 20:54:00,131 - gimme.scanner - DEBUG - Scanning 2020-04-30 20:54:00,560 - gimme.stats - DEBUG - calculating statistics 2020-04-30 20:54:00,716 - gimme.stats - DEBUG - calculating statistics 2020-04-30 20:54:00,806 - gimme.stats - DEBUG - calculating statistics 2020-04-30 20:54:00,886 - gimme.stats - DEBUG - calculating statistics 2020-04-30 20:54:01,573 - gimme.prediction - DEBUG - Stats: genomic {'gimme_1_Homer_5_1_ssCGC': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.0032733224222585926, 'score_at_fpr': 1.6464321895485772, 'enr_at_fpr': 0.17241379310344826, 'max_enrichment': 1.4814814814814814, 'phyper_at_fpr': 0.9998467286896049, 'mncp': 1.1147419349085834, 'roc_auc': 0.5851426788206396, 'roc_auc_xlim': 0.002664322608814293, 'pr_auc': 0.1039116204532634, 'max_fmeasure': 0.6918494448957486, 'ks_pvalue': 0.0032283569956327333, 'ks_significance': 2.491018446483821}, 'gimme_2_Homer_5_2_GCGCG': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.016366612111292964, 'score_at_fpr': 2.4969587974719745, 'enr_at_fpr': 0.9090909090909092, 'max_enrichment': 4.354243542435425, 'phyper_at_fpr': 0.6615362716959072, 'mncp': 1.6263623924242232, 'roc_auc': 0.6401863275840363, 'roc_auc_xlim': 0.013389205371413714, 'pr_auc': 0.15141652807383343, 'max_fmeasure': 0.6732673267326732, 'ks_pvalue': 0.06386799595456169, 'ks_significance': 1.194716710889612}, 'gimme_3_Homer_5_3_GGCCG': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.004909983633387889, 'score_at_fpr': 2.2381527526047624, 'enr_at_fpr': 0.3896103896103896, 'max_enrichment': 1.5178571428571428, 'phyper_at_fpr': 0.9806878712650521, 'mncp': 1.169429889361657, 'roc_auc': 0.5749518510879377, 'roc_auc_xlim': 0.005003985463295283, 'pr_auc': 0.1089544630109261, 
'max_fmeasure': 0.6668485675306958, 'ks_pvalue': 0.2488299141649949, 'ks_significance': 0.6040974102519641}, 'gimme_4_Homer_5_4_nGsGs': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.0425531914893617, 'score_at_fpr': 1.269303164435028, 'enr_at_fpr': 3.3766233766233764, 'max_enrichment': 4.545454545454546, 'phyper_at_fpr': 1.0009761771287182e-06, 'mncp': 1.323672618406123, 'roc_auc': 0.5634328098338963, 'roc_auc_xlim': 0.009572057291178371, 'pr_auc': 0.12185858150825828, 'max_fmeasure': 0.6769406392694064, 'ks_pvalue': 0.0002738240999761263, 'ks_significance': 3.562528331125182}, 'gimme_5_Homer_5_5_GTGTG': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.008183306055646482, 'score_at_fpr': 1.766600348877708, 'enr_at_fpr': 0.18315018315018317, 'max_enrichment': 1.6633186018481319, 'phyper_at_fpr': 0.9999999422313427, 'mncp': 1.0804413849197085, 'roc_auc': 0.5553143273483142, 'roc_auc_xlim': 0.0018653231018728772, 'pr_auc': 0.10114851175654987, 'max_fmeasure': 0.6667757952747312, 'ks_pvalue': 0.00023325631970209403, 'ks_significance': 3.632166580916657}}
2020-04-30 20:54:01,745 - gimme.prediction - DEBUG - Stats: genomic {'gimme_6_Homer_6_1_GssGCG': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.009819967266775777, 'score_at_fpr': 2.268389959631275, 'enr_at_fpr': 0.9090909090909092, 'max_enrichment': 2.5296442687747036, 'phyper_at_fpr': 0.6493575250931465, 'mncp': 1.4734478998815208, 'roc_auc': 0.6288943027582161, 'roc_auc_xlim': 0.009685119061969012, 'pr_auc': 0.1353320119310367, 'max_fmeasure': 0.6722689075630253, 'ks_pvalue': 0.2722890169137306, 'ks_significance': 0.5649698760671574}, 'gimme_7_Homer_6_2_CssCCC': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.009819967266775777, 'score_at_fpr': 2.662281410838917, 'enr_at_fpr': 0.8695652173913044, 'max_enrichment': 2.0118343195266273, 'phyper_at_fpr': 0.6884997839656567, 'mncp': 1.293470316288715, 'roc_auc': 0.5956288288095232, 'roc_auc_xlim': 0.00848385169867219, 'pr_auc': 0.11856765425648452, 'max_fmeasure': 0.6689374057736334, 'ks_pvalue': 0.3948389112597998, 'ks_significance': 0.403580054298514}, 'gimme_8_Homer_6_3_wGCGnG': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.02618657937806874, 'score_at_fpr': 1.9785383801600716, 'enr_at_fpr': 2.077922077922078, 'max_enrichment': 3.6040609137055837, 'phyper_at_fpr': 0.0090237606521801, 'mncp': 1.461436903759879, 'roc_auc': 0.5983551688761147, 'roc_auc_xlim': 0.011299928998504752, 'pr_auc': 0.13468399515730528, 'max_fmeasure': 0.6692406692406693, 'ks_pvalue': 0.5036306474248041, 'ks_significance': 0.2978878496473919}, 'gimme_9_Homer_6_4_kGTGTG': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.0032733224222585926, 'score_at_fpr': 2.3014861519968606, 'enr_at_fpr': 0.16666666666666666, 'max_enrichment': 1.702395964691047, 'phyper_at_fpr': 0.9998927506443075, 'mncp': 1.1454341339528784, 'roc_auc': 0.5442092729849111, 'roc_auc_xlim': 0.00478311560292617, 'pr_auc': 0.10558839967799794, 'max_fmeasure': 0.6691531373409391, 'ks_pvalue': 0.19154444745913754, 'ks_significance': 0.7177304329411931}, 'gimme_10_Homer_6_5_GGGTAA': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.004909983633387889, 'score_at_fpr': 2.208736850300779, 'enr_at_fpr': 0.29411764705882354, 'max_enrichment': 2.5306122448979593, 'phyper_at_fpr': 0.997174054709443, 'mncp': 1.298304749767953, 'roc_auc': 0.5536303877895967, 'roc_auc_xlim': 0.009550846877933663, 'pr_auc': 0.1205719146743844, 'max_fmeasure': 0.6692935235029691, 'ks_pvalue': 0.021781366425103345, 'ks_significance': 1.661914878837917}}
2020-04-30 20:54:01,891 - gimme.prediction - DEBUG - Stats: genomic {'gimme_11_Homer_7_1_sCGCGsn': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.03927986906710311, 'score_at_fpr': 2.7031698761248566, 'enr_at_fpr': 3.8709677419354835, 'max_enrichment': 4.528301886792453, 'phyper_at_fpr': 3.5875174102581436e-07, 'mncp': 1.7089508232690014, 'roc_auc': 0.6574521658304784, 'roc_auc_xlim': 0.01227837883930094, 'pr_auc': 0.15586531342784335, 'max_fmeasure': 0.6800054075976747, 'ks_pvalue': 0.04132745343405507, 'ks_significance': 1.3837613547863097}, 'gimme_12_Homer_7_2_sGGrGnG': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.009819967266775777, 'score_at_fpr': 2.3734264911388783, 'enr_at_fpr': 0.9523809523809524, 'max_enrichment': 2.0552147239263805, 'phyper_at_fpr': 0.6073699190793794, 'mncp': 1.305480306993171, 'roc_auc': 0.5967096680872492, 'roc_auc_xlim': 0.008707921600981465, 'pr_auc': 0.11888753809255136, 'max_fmeasure': 0.6753567816608608, 'ks_pvalue': 0.32715703939282575, 'ks_significance': 0.4852437306107061}, 'gimme_13_Homer_7_3_GTAGCGG': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.01309328968903437, 'score_at_fpr': 2.772642458036875, 'enr_at_fpr': 1.2903225806451613, 'max_enrichment': 2.1428571428571432, 'phyper_at_fpr': 0.30223600325761457, 'mncp': 1.1751755029388449, 'roc_auc': 0.5590534687306634, 'roc_auc_xlim': 0.0061066749526546855, 'pr_auc': 0.10704724384820677, 'max_fmeasure': 0.6675435711936862, 'ks_pvalue': 0.584382156274397, 'ks_significance': 0.2333030534322594}, 'gimme_14_Homer_7_4_wGCCCGG': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.02127659574468085, 'score_at_fpr': 2.211882052025955, 'enr_at_fpr': 1.9696969696969697, 'max_enrichment': 4.166666666666667, 'phyper_at_fpr': 0.024812681669602165, 'mncp': 1.4179867282523073, 'roc_auc': 0.615673107058001, 'roc_auc_xlim': 0.008568229486152669, 'pr_auc': 0.1292664130409116, 'max_fmeasure': 0.6751154802795215, 'ks_pvalue': 0.46888239386386665, 'ks_significance': 0.3289360743443744}, 'gimme_15_Homer_7_5_GTGTGTG': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.0032733224222585926, 'score_at_fpr': 2.9763933947705885, 'enr_at_fpr': 0.3773584905660377, 'max_enrichment': 2.967032967032967, 'phyper_at_fpr': 0.9661210776488782, 'mncp': 1.29490230957275, 'roc_auc': 0.5607558910428292, 'roc_auc_xlim': 0.009817824338839765, 'pr_auc': 0.11895442094564967, 'max_fmeasure': 0.6679075878681704, 'ks_pvalue': 0.8408545513070929, 'ks_significance': 0.0752791207629557}}
2020-04-30 20:54:02,036 - gimme.prediction - DEBUG - Stats: genomic {'gimme_16_Homer_8_1_nnsnsnCn': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.031096563011456628, 'score_at_fpr': 2.758962354936089, 'enr_at_fpr': 3.064516129032258, 'max_enrichment': 4.166666666666667, 'phyper_at_fpr': 8.554363753266096e-05, 'mncp': 1.6351467689590409, 'roc_auc': 0.6409008601177004, 'roc_auc_xlim': 0.011674135663410311, 'pr_auc': 0.14883952413243381, 'max_fmeasure': 0.6695810967849041, 'ks_pvalue': 0.9119071163751599, 'ks_significance': 0.04004939511461384}, 'gimme_17_Homer_8_2_CTCGCGCG': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.02127659574468085, 'score_at_fpr': 2.721630241332524, 'enr_at_fpr': 2.0967741935483875, 'max_enrichment': 4.0, 'phyper_at_fpr': 0.016609511630748863, 'mncp': 1.2704331703481724, 'roc_auc': 0.5684161887490926, 'roc_auc_xlim': 0.00776744410306412, 'pr_auc': 0.1154543982701069, 'max_fmeasure': 0.6668994413407822, 'ks_pvalue': 0.6372664451717927, 'ks_significance': 0.1956789483966163}, 'gimme_18_Homer_8_3_CGGAGCGG': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.02618657937806874, 'score_at_fpr': 2.4925487886673863, 'enr_at_fpr': 2.5, 'max_enrichment': 3.333333333333334, 'phyper_at_fpr': 0.0019330759874455709, 'mncp': 1.2577952996864987, 'roc_auc': 0.5578705725099846, 'roc_auc_xlim': 0.008360365476359485, 'pr_auc': 0.11445599311793539, 'max_fmeasure': 0.6671033955672017, 'ks_pvalue': 0.3033178680370632, 'ks_significance': 0.5181020052471033}, 'gimme_19_Homer_8_4_GGGAGGGG': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.02127659574468085, 'score_at_fpr': 2.7951054983942067, 'enr_at_fpr': 2.0967741935483875, 'max_enrichment': 10.0, 'phyper_at_fpr': 0.016609511630748863, 'mncp': 1.4269277665195492, 'roc_auc': 0.5830156889111515, 'roc_auc_xlim': 0.010849644140029624, 'pr_auc': 0.12981113157583224, 'max_fmeasure': 0.6698092523569393, 'ks_pvalue': 0.3773666053923275, 'ks_significance': 0.42323653484288276}, 'gimme_20_Homer_8_5_TTACCCTG': {'recall_at_fdr': 0.0, 'fraction_fpr': 0.05073649754500818, 'score_at_fpr': 2.9143739464669585, 'enr_at_fpr': 4.025974025974026, 'max_enrichment': 14.285714285714288, 'phyper_at_fpr': 3.293631164001038e-09, 'mncp': 1.49143456875555, 'roc_auc': 0.5815578282496832, 'roc_auc_xlim': 0.010860894511693689, 'pr_auc': 0.13709375265478393, 'max_fmeasure': 0.6669941597074396, 'ks_pvalue': 0.19150838387264513, 'ks_significance': 0.7178122086931337}}
2020-04-30 20:54:04,038 - gimme.prediction - INFO - predicted 20 motifs
2020-04-30 20:54:04,038 - gimme.prediction - DEBUG - written to try-0/intermediate/all_motifs.pfm
2020-04-30 20:54:04,040 - gimme.denovo - INFO - 10 motifs are significant
2020-04-30 20:54:04,040 - gimme.denovo - DEBUG - written to try-0/intermediate/significant_motifs.pfm
2020-04-30 20:54:04,187 - gimme.cluster - INFO - clustering 10 motifs.
2020-04-30 20:54:09,462 - gimme.cluster - DEBUG - Clustering done. See the result in try-0/gimme.clustereds.html
2020-04-30 20:54:09,595 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:09,637 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:54:09,665 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:54:09,679 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:54:09,695 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:54:10,021 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:54:10,043 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:54:10,068 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:54:10,095 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:54:10,124 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:54:10,246 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:54:10,286 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:54:10,456 - gimme.stats - DEBUG - chunk 1.0 of 1
2020-04-30 20:54:10,538 - gimme.scanner - DEBUG - scanning try-0/intermediate/validation.fa...
2020-04-30 20:54:10,551 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:10,641 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:54:10,741 - gimme.scanner - DEBUG - scanning try-0/intermediate/bg.genomic.fa...
2020-04-30 20:54:10,829 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:11,119 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:54:11,381 - gimme.stats - DEBUG - calculating statistics
2020-04-30 20:54:12,006 - gimme.report - INFO - creating de novo reports
2020-04-30 20:54:12,163 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:12,167 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:12,170 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:12,171 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:12,172 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:12,172 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:12,173 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:12,439 - gimme.stats - DEBUG - chunk 1.0 of 1
2020-04-30 20:54:12,449 - gimme.scanner - DEBUG - scanning try-0/intermediate/validation.fa...
2020-04-30 20:54:12,457 - gimme.stats - DEBUG - chunk 1.0 of 1
2020-04-30 20:54:12,460 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:12,474 - gimme.scanner - DEBUG - scanning try-0/intermediate/validation.fa...
2020-04-30 20:54:12,483 - gimme.stats - DEBUG - chunk 1.0 of 1
2020-04-30 20:54:12,485 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:12,491 - gimme.scanner - DEBUG - scanning try-0/intermediate/validation.fa...
2020-04-30 20:54:12,500 - gimme.stats - DEBUG - chunk 1.0 of 1
2020-04-30 20:54:12,502 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:12,508 - gimme.scanner - DEBUG - scanning try-0/intermediate/validation.fa...
2020-04-30 20:54:12,518 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:12,537 - gimme.stats - DEBUG - chunk 1.0 of 1
2020-04-30 20:54:12,547 - gimme.scanner - DEBUG - scanning try-0/intermediate/validation.fa...
2020-04-30 20:54:12,556 - gimme.stats - DEBUG - chunk 1.0 of 1
2020-04-30 20:54:12,557 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:12,565 - gimme.scanner - DEBUG - scanning try-0/intermediate/validation.fa...
2020-04-30 20:54:12,618 - gimme.stats - DEBUG - chunk 1.0 of 1
2020-04-30 20:54:12,618 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:12,632 - gimme.scanner - DEBUG - scanning try-0/intermediate/validation.fa...
2020-04-30 20:54:12,642 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:12,839 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:54:12,871 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:54:12,890 - gimme.scanner - DEBUG - scanning try-0/intermediate/bg.genomic.fa...
2020-04-30 20:54:12,898 - gimme.scanner - DEBUG - scanning try-0/intermediate/bg.genomic.fa...
2020-04-30 20:54:12,939 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:54:12,949 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:54:12,965 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:12,968 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:54:12,971 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:12,986 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:54:12,992 - gimme.scanner - DEBUG - scanning try-0/intermediate/bg.genomic.fa...
2020-04-30 20:54:13,020 - gimme.scanner - DEBUG - scanning try-0/intermediate/bg.genomic.fa...
2020-04-30 20:54:13,022 - gimme.scanner - DEBUG - scanning try-0/intermediate/bg.genomic.fa...
2020-04-30 20:54:13,024 - gimme.scanner - DEBUG - scanning try-0/intermediate/bg.genomic.fa...
2020-04-30 20:54:13,064 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:54:13,088 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:13,094 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:13,099 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:13,100 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:13,102 - gimme.scanner - DEBUG - scanning try-0/intermediate/bg.genomic.fa...
2020-04-30 20:54:13,229 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:54:13,439 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:54:13,456 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:54:13,522 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:54:13,554 - gimme.stats - DEBUG - calculating statistics
2020-04-30 20:54:13,555 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:54:13,562 - gimme.stats - DEBUG - calculating statistics
2020-04-30 20:54:13,590 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:54:13,595 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:54:13,619 - gimme.stats - DEBUG - calculating statistics
2020-04-30 20:54:13,633 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:54:13,684 - gimme.stats - DEBUG - calculating statistics
2020-04-30 20:54:13,720 - gimme.stats - DEBUG - calculating statistics
2020-04-30 20:54:13,725 - gimme.stats - DEBUG - calculating statistics
2020-04-30 20:54:13,763 - gimme.stats - DEBUG - calculating statistics
2020-04-30 20:54:21,544 - gimme.report - DEBUG - Creating localization plots
2020-04-30 20:54:21,544 - gimme.report - DEBUG - GimmeMotifs_1 GimmeMotifs_1_sCGCGsn
2020-04-30 20:54:21,757 - gimme.report - DEBUG - GimmeMotifs_2 GimmeMotifs_2_GCGCG
2020-04-30 20:54:21,940 - gimme.report - DEBUG - GimmeMotifs_3 GimmeMotifs_3_GGGAGGGG
2020-04-30 20:54:22,135 - gimme.report - DEBUG - GimmeMotifs_4 GimmeMotifs_4_TTACCCTG
2020-04-30 20:54:22,321 - gimme.report - DEBUG - GimmeMotifs_5 GimmeMotifs_5_CTCGCGCG
2020-04-30 20:54:22,509 - gimme.report - DEBUG - GimmeMotifs_6 GimmeMotifs_6_nGsGs
2020-04-30 20:54:22,688 - gimme.report - DEBUG - GimmeMotifs_7 GimmeMotifs_7_CGGAGCGG
2020-04-30 20:54:22,870 - gimme.report - DEBUG - Creating graphical report
2020-04-30 20:54:32,864 - gimme.denovo - DEBUG - Deleting intermediate files. Please specifify the -k option if you want to keep these files.
2020-04-30 20:54:32,866 - gimme.denovo - INFO - finished
2020-04-30 20:54:32,866 - gimme.denovo - INFO - output dir: try-0
2020-04-30 20:54:32,866 - gimme.denovo - INFO - de novo report: try-0/gimme.denovo.html
2020-04-30 20:55:02,274 - gimme.motifs - INFO - creating motif scan tables
2020-04-30 20:55:02,428 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 352
2020-04-30 20:55:02,510 - gimme.scanner - INFO - using 10000 sequences
2020-04-30 20:55:03,509 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:55:03,567 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:55:03,624 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:55:03,683 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:55:03,892 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:55:03,983 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:55:04,387 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:55:04,514 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:55:04,782 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:55:05,224 - gimme.scanner - DEBUG - Determining mean and stddev for motifs.
2020-04-30 20:55:05,690 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:55:06,128 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 200
2020-04-30 20:55:06,783 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:55:07,422 - gimme.motifs - INFO - calculating stats
2020-04-30 20:55:07,463 - gimme.stats - DEBUG - chunk 1.0 of 1
2020-04-30 20:55:07,490 - gimme.stats - DEBUG - calculating statistics
2020-04-30 20:55:08,827 - gimme.comparison - DEBUG - Selected 5 motifs for feature elimination
2020-04-30 20:55:08,919 - gimme.comparison - INFO - selecting non-redundant motifs
2020-04-30 20:55:09,324 - gimme.comparison - INFO - selected 1 non-redundant motifs: ROC AUC 0.625, PR AUC 0.145
2020-04-30 20:55:09,437 - gimme.scanner - DEBUG - using background: genome sacCer3 with size 352
2020-04-30 20:55:09,490 - gimme.scanner - INFO - determining FPR-based threshold
2020-04-30 20:55:09,838 - gimme.scanner - DEBUG - Scanning
2020-04-30 20:55:09,858 - gimme.motifs - INFO - creating statistics report
2020-04-30 20:55:12,090 - gimme.motifs - INFO - gimme motifs final report: try-0/gimme.motifs.html