I have a loop that searches for my content everywhere, and another loop that searches a specific sub, but I cannot find a way to apply both filters in the same loop.

```python
for submission in reddit.subreddit("sub").hot(limit=10000):
    ...

for comment in redditor.comments.hot(limit=10000):
    ...
```

The problem is the 1,000-item listing limit: if I try to refine the results in Python inside the loop, the listing still only returns 1,000 items, and 99% of those are other people’s comments, so the filter misses most of my targets.
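
To be concrete, here’s the kind of combined filter I mean: walking my own comment listing and matching the sub client-side. This is just a sketch; the credentials and names are placeholders, and it only covers whatever the listing actually returns:

```python
import praw

# Placeholder credentials for a standard PRAW script app.
reddit = praw.Reddit(
    client_id="CLIENT_ID",
    client_secret="CLIENT_SECRET",
    username="USERNAME",
    password="PASSWORD",
    user_agent="comment-cleanup sketch by u/USERNAME",
)
redditor = reddit.redditor("USERNAME")

# Walk my own comments (all subs) and filter for one sub in Python.
# The listing itself is still capped at roughly 1,000 items.
for comment in redditor.comments.new(limit=None):
    if comment.subreddit.display_name.lower() == "linux":
        print(comment.id, comment.permalink)
```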

The result is that a lot of my comments won’t be touched; I can still see many of them in search engines.

How did you manage to remove as many comments as possible? I know you can also sort by controversial, but I was wondering if there is some PRAW finesse I could use here.

  • abff08f4813c@kbin.social · 1 year ago

    It’s a Reddit limitation. There’s no index for your comments in a specific sub; there are only indexes on your comments generally and on the sub itself.
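
    One thing you can try purely within PRAW (my own sketch, not a full fix) is to union several sorts of your comment listing: each sort is capped at roughly 1,000 items independently, so together they sometimes surface a few more of your comments. The credentials and username below are placeholders:

    ```python
    import praw

    # Placeholder credentials for a standard PRAW script app.
    reddit = praw.Reddit(
        client_id="CLIENT_ID",
        client_secret="CLIENT_SECRET",
        username="USERNAME",
        password="PASSWORD",
        user_agent="listing-union sketch by u/USERNAME",
    )
    redditor = reddit.redditor("USERNAME")

    # Each sort is capped at ~1,000 items on its own, so the union of
    # several sorts can reach more of your history than any single one.
    seen = set()
    for listing in (
        redditor.comments.new(limit=None),
        redditor.comments.top(limit=None),
        redditor.comments.controversial(limit=None),
        redditor.comments.hot(limit=None),
    ):
        for comment in listing:
            seen.add(comment.id)

    print(f"{len(seen)} unique comments reachable via the listings")
    ```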

    The way around this used to be the Pushshift API, which held a copy of everything and had no limits. Since Reddit shut that down, you now need to do more work.

    I recently outlined how to do this here: https://kbin.social/m/RedditMigration/t/65260/PSA-Here-s-exactly-what-to-do-if-you-hit-the

    • PabloDiscobar@kbin.social (OP) · 1 year ago

      Thanks, I did it yesterday… and I crashed my hard drive. It’s a 1.2 TB archive, and my SSD couldn’t even handle the I/O of the empty entries alone.

      I was still able to count my posts in /r/linux: more than 1,400. It goes fast.

      • abff08f4813c@kbin.social · 1 year ago

        Ouch. Sorry about the crash. If you try it again, keep in mind that you can download individual subs without grabbing the whole torrent; that’s what I did, and because of my specific subs I was able to keep it under 1 GB.
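
        And if the I/O is the bottleneck, you don’t need to extract the dump at all: the per-sub files are zstandard-compressed newline-delimited JSON, so you can stream and filter them in place. A rough sketch, where the filename and username are placeholders and the oversized decompression window is a quirk of these dumps:

        ```python
        import json
        import zstandard  # pip install zstandard

        # Placeholder filename and username; the per-sub dumps are
        # zstandard-compressed newline-delimited JSON.
        DUMP = "linux_comments.zst"
        ME = "USERNAME"

        count = 0
        with open(DUMP, "rb") as fh:
            # The dumps were compressed with a large window, so the
            # decompressor needs an oversized max_window_size.
            reader = zstandard.ZstdDecompressor(
                max_window_size=2**31
            ).stream_reader(fh)
            buffer = b""
            while chunk := reader.read(2**20):
                buffer += chunk
                # Keep any trailing partial line in the buffer.
                *lines, buffer = buffer.split(b"\n")
                for line in lines:
                    if line and json.loads(line).get("author") == ME:
                        count += 1
            if buffer and json.loads(buffer).get("author") == ME:
                count += 1  # final line without a trailing newline

        print(f"{count} comments by {ME} in this dump")
        ```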