I went to NCBI and found a large file, one I thought my application might struggle with. It took time, but it processed! If this application were actually put to use, I imagine it should be hosted somewhere with plenty of memory to get through sequences like this. Anyway, here are some screenshots of what happened.
The contig itself is from: https://www.ncbi.nlm.nih.gov/Traces/wgs/?val=JOVY01
In particular, I used Campylobacter coli strain CVM N287 N287_contig_8, which is 207270 characters long.
I’m still working on ‘Superframe’, but you can see that most of the GC content percentage regions that fell outside the mean threshold sit within the ORF locations, which is exactly what I hoped to see.
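For readers who want to try this kind of check themselves, here is a minimal Python sketch of the idea. It is not the application's actual code. It assumes fixed-size, non-overlapping windows, and it assumes "outside the mean threshold" means more than `k` standard deviations from the mean window GC%; the function names, window size, and `k` are all hypothetical choices made for illustration.

```python
from statistics import mean, pstdev


def gc_percent(seq):
    """Return the GC content of seq as a percentage (0.0 for an empty string)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def flag_gc_windows(seq, window=100, step=100, k=1.0):
    """Return (start, end, gc) for each window whose GC% lies more than
    k standard deviations from the mean window GC%.

    Assumption: this is only one possible reading of "outside the mean
    threshold"; the real application may define it differently.
    """
    windows = [(i, min(i + window, len(seq))) for i in range(0, len(seq), step)]
    values = [gc_percent(seq[s:e]) for s, e in windows]
    if not values:
        return []
    mu, sigma = mean(values), pstdev(values)
    return [(s, e, v) for (s, e), v in zip(windows, values) if abs(v - mu) > k * sigma]


def overlaps_any(start, end, intervals):
    """True if the half-open range [start, end) overlaps any (start, end) interval."""
    return any(start < o_end and o_start < end for o_start, o_end in intervals)


# Toy sequence: 400 bases of A followed by 100 bases of G.
toy = "A" * 400 + "G" * 100
flagged = flag_gc_windows(toy)
# Hypothetical ORF coordinates, purely for demonstration.
orfs = [(350, 450)]
inside = [w for w in flagged if overlaps_any(w[0], w[1], orfs)]
print(f"{len(inside)} of {len(flagged)} flagged windows overlap an ORF")
```

Counting how many flagged windows overlap an ORF gives a number to put next to the screenshots, rather than judging the match by eye.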
The next step is to find a few more contigs, smaller ones for time's sake, and run them individually to check that their GC content % regions match their ORF locations. Then I'll start mixing them together in places where I know the GC content % should differ, and see whether those differences show up after processing the mixed contig data.
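One way the mixing step could be set up is sketched below in Python. This is an assumption about how such a test might look, not a description of the application: `mix_contigs` and `source_of` are hypothetical helpers that join contigs end to end and remember where each one starts and stops, so any region flagged in the mixed sequence can be traced back to the contig it came from.

```python
def mix_contigs(contigs):
    """Concatenate (name, sequence) pairs into one sequence.

    Returns (mixed_sequence, segments), where segments is a list of
    (name, start, end) half-open coordinates in the mixed sequence.
    """
    parts, segments, pos = [], [], 0
    for name, seq in contigs:
        parts.append(seq)
        segments.append((name, pos, pos + len(seq)))
        pos += len(seq)
    return "".join(parts), segments


def source_of(position, segments):
    """Return the name of the contig covering position, or None if out of range."""
    for name, start, end in segments:
        if start <= position < end:
            return name
    return None


# Toy contigs with obviously different GC content (made-up data).
mixed, segments = mix_contigs([("low_gc", "ATAT"), ("high_gc", "GCGCGC")])
print(mixed)
print(segments)
print(source_of(4, segments))
```

Keeping the segment boundaries alongside the mixed sequence means the junctions are known in advance, so a shift in GC content % can be compared against where it is expected to appear.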