    The Onewheel Forum

    Organising the Archive

    • Lia
      Lia GT XR Pint Plus V1 DIY @biell last edited by

      @biell That looks perfect, the missing post bit is great too. Thanks for implementing it :)

      I should be able to install Perl. Worst case I can fire up a VM for whatever OS I need to run it.

      @biell said in Organising the Archive:

      If you have any information you can provide about how you pulled down the content so far

      The way I did it was really messy and manual. I spent weeks going to Google and entering "site:community.onewheel.com/topic/XXXXX", manually incrementing the topic number by 1 each time to see whatever Google had for that topic. Once I got results I'd click the options next to the link and try to browse a cache if it existed, then Ctrl+S and add the page ID the URL gave in brackets to the file name. With nearly 10000 topic IDs at the time of the forum closing it took me forever D: After 10-20 searches Google would make me do a captcha to check I wasn't a robot, which made things slower. Pain.jpg >.> I did it this way because Google did a cache a day or 2 before the site got replaced with the maintenance message, and I was worried that if I didn't act fast they'd send the cache robot to update itself again and lose all the data. Didn't think I had time to learn how to automate this.
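The search loop described above can be sketched in a few lines. This is only an illustration: it builds the query strings and does not contact Google, and the function name `search_queries` is hypothetical.

```python
def search_queries(first_id, last_id, site="community.onewheel.com"):
    """Yield one 'site:' search query per topic ID (both ends inclusive)."""
    for topic_id in range(first_id, last_id + 1):
        # Same pattern as the manual search: site:<host>/topic/<id>
        yield f"site:{site}/topic/{topic_id}"


# Example: the first three queries that would have been typed by hand.
for query in search_queries(1, 3):
    print(query)
```

Generating the strings up front at least removes the manual "+1" step, even if each search still has to be run by hand because of the captcha.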

      .

      I'll get to work uploading it soon, would you like a name or some sort of credit placed anywhere on the archive for helping resolve this mammoth part of the job?

      • B
        biell Plus GT DIY @Lia last edited by

        @lia I thought of another way of doing the missing post, I like it better. What do you think?

        missing.png

        • Lia
          Lia GT XR Pint Plus V1 DIY @biell last edited by Lia

          @biell That's a much better idea. Good thinking :)
          I think there might be a few topics where there are a lot of missing posts so that'll help keep it tidy.

          • B
            biell Plus GT DIY @Lia last edited by

            @lia I am glad you did the work when you did. I tried to do a search the way you describe, and Google doesn't have anything. I tried archive.org and didn't have much luck there past the first page of results.

            I don't need any credit.

            BTW, this is what running it will look like for the missing articles and coverage percentage:

            $ ./forum_archive topic-20
            topic/20/france-onewheel-riders :
             Missing: 83-136 160-169 249-253 274-281 302-339 382-397 514-545 566-588 687-704 880-898 919 940-943 1040-1041 1062-1072 1110-1111 1244-1253 1287-1312 1333-1339 1360-1374 1395-1412 1455-1459 1487-1492 1522-1528 1560-1571 1682 1732-1734 1797-1800 1834-1842 1902-1996 2017-2025 2093-2101 2128-2139 2176-2193 2228-2238 2272-2275 
             Coverage: 76%
            $
            

            So, if you have a bunch of folders all containing different topics under a single location, you could just run forum_archive * and it will loop through each, putting all the data under either assets or topic depending on the data type.
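As a rough picture of that loop, here is a minimal Python sketch (the actual script is Perl and is not shown in this thread). The rule used to decide the data type, HTML pages go to topic and everything else goes to assets, is an assumption, and `classify` and `plan_moves` are hypothetical names.

```python
from pathlib import Path

# Assumption: saved pages are HTML files; any other file is treated as an asset.
HTML_SUFFIXES = {".html", ".htm"}


def classify(path):
    """Return 'topic' for saved HTML pages and 'assets' for everything else."""
    return "topic" if Path(path).suffix.lower() in HTML_SUFFIXES else "assets"


def plan_moves(root):
    """Walk each sub-folder of root and group its files by destination."""
    plan = {"topic": [], "assets": []}
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for item in sorted(folder.rglob("*")):
            if item.is_file():
                plan[classify(item)].append(item)
    return plan
```

Building a plan first, rather than moving files straight away, makes it possible to review what would end up where before anything is touched.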

            I will make the change for the simpler, less obtrusive missing posts method. I will take a pass through the script, clean up the code a bit, add some documentation, and send it over. The script is short, so even not knowing perl you would be able to read through it before running it, to verify you trust it. It should also be easy to change the broken image link icon if you don't like the one I selected.

            • Lia
              Lia GT XR Pint Plus V1 DIY @biell last edited by

              @biell I can find a few things on archive.org, but I found searching with it extra tedious, so I'm planning to find the rest through it later if I can, since that isn't time sensitive.

              If you're sure, thank you so much for the effort you put in <3

              That's really helpful. I'll cast an eye through it, might learn a thing or two as I do :)

              • B
                biell Plus GT DIY @Lia last edited by

                @lia The Perl script can be found here:

                https://drive.google.com/file/d/1UoQaB3_wzojQOilSgkt-TNamNYDRqwLU/view?usp=sharing

                Basically, I assume you have a folder somewhere and it contains folders, each of which looks like the zip file you sent me. If so, you could literally just run forum_archive * in that location and it would read through all the sub-folders, then neatly pack everything under an assets and a topic folder, with you just having to move those two directory structures into your webserver's DocRoot.

                I don't think I used any Perl modules you wouldn't find standard.

                Please let me know if you have any questions.

                • Lia
                  Lia GT XR Pint Plus V1 DIY @biell last edited by Lia

                  @biell Thank you :)
                  I'll hopefully give it a whirl this weekend since it's raining here anyway so plenty of indoor time to work with.

                  Thank you for making this :)

                  • Lia
                    Lia GT XR Pint Plus V1 DIY @biell last edited by

                    @biell Finally found time to give it a whirl. Not sure if it's Windows or what I'm doing but I can't get it to work properly. Probably something I'm doing no doubt >.>

                    Using StrawberryPerl on a Windows10 Laptop.
                    Renamed the file to forum_archive.pl as Windows couldn't figure out what to do with it in cmd.
                    Script exists in a directory (Forum) and inside with the script is a folder called RAW where I'm dumping all the cache scrapings.


                    Here is a Tree of that if it's easier to follow (note 2 and 22 were separated for testing later).
                    29467eed-8a23-4948-bfd1-6fc585be6c78-image.png

                    If I run "forum_archive.pl *" it complains it wants a specific directory.
                    dd12c32c-a8c8-4f12-9829-cdad286f6c53-image.png
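A likely cause, offered as an inference rather than something confirmed in this thread: a Unix shell expands `*` into the matching names before the script starts, while cmd.exe on Windows passes the `*` through literally. A script that should accept `*` on both has to expand wildcards itself (Perl has a built-in `glob` function for this). The same idea as a minimal Python sketch, with `expand_args` as a hypothetical helper:

```python
import glob


def expand_args(args):
    """Expand wildcard arguments the way a Unix shell would have done."""
    expanded = []
    for arg in args:
        has_wildcard = any(ch in arg for ch in "*?[")
        matches = sorted(glob.glob(arg)) if has_wildcard else []
        # Keep the argument unchanged when it has no wildcard or matches nothing.
        expanded.extend(matches or [arg])
    return expanded
```

A script would call `expand_args(sys.argv[1:])` once at startup and then treat the result as its list of directories.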

                    "forum_archive.pl RAW" runs it but then states an illegal division by zero on line 451 occurred when trying to give the "missing" field.
                    fbc5f3f0-f21b-4bb3-b892-7d73bd8058ef-image.png

                    If I place the topics in their own directory rather than all in the same one it runs and completes but doesn't move any of the images to the resources directory.
                    cf802014-c863-4d40-be46-a9d7e22e4260-image.png
                    b375bcba-121f-4523-8de1-c82f9336f6f6-image.png
                    d557b8c0-03cb-4d93-8488-a462e522f133-image.png

                    If a topic only has a single entry to run it gives the division by zero error again. I assume this might be what is causing the "forum_archive.pl RAW" command to fail.
                    db77e607-e850-454d-9e1d-c2b0bf502e1d-image.png
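Whatever the root cause, a percentage calculation can be protected against an empty or unknown total. A minimal Python sketch of such a guard, where `coverage_percent` is a hypothetical name and returning `None` for an unknown total is one possible design choice:

```python
def coverage_percent(found, total):
    """Return whole-number coverage, or None when the total is unknown (zero)."""
    if total <= 0:
        # Nothing to divide by: let the caller report "unknown" instead of crashing.
        return None
    return found * 100 // total
```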


                    I've zipped up exactly what I have currently with a selection of files in RAW. I've left topic 2 and 22 as separated so you can see what worked and what didn't.
                    Forum.zip

                    • B
                      biell Plus GT DIY @Lia last edited by

                      @lia I will look into this. I see a couple things. Firstly, I made a couple mistakes because I only had a single topic to test against (oops). Second, I was expecting everything to be organized like your "2" and "22" directories, I may have to do some finagling to handle multiple topics in the same directory.

                      Also, I see that topic "2" is causing an issue, so I will work through that now that I have more data.

                      Please just give me a few days to iron this out. Sorry about that. I may need the layout to be consistent, organized either like topic 22 or like topic 23.

                      • OneDanGTS
                        OneDanGTS GT-S GT Pint X XR @biell last edited by

                        @biell said in Organising the Archive:

                        Firstly, I made a couple mistakes

                        What??? Mistakes in software??? This can not be!!! /s ... I'm a retired software engineer :D

                        GT-S > GT > Pint X > XR > +

                        • Lia
                          Lia GT XR Pint Plus V1 DIY @biell last edited by

                          @biell Thank you :) no rush on it at all, been caught up in a few things so haven’t been able to focus on much.

                          My fault for not providing more test data, a single example was a bit dim of me to provide.

                          • Swinefeaster
                            Swinefeaster last edited by

                            i had a post about my flightfin switch install. is that saved in the archive somewhere by any chance?

                            • Lia
                              Lia GT XR Pint Plus V1 DIY @Swinefeaster last edited by

                              @swinefeaster One of the ones I specifically looked for when I started the project. I have it.
                              79ec6143-2c37-46d1-8482-fc9ea5a1826e-image.png

                              Shall be up soon, I might manually do this one since it's only 29 posts long.

                              • Lia
                                Lia GT XR Pint Plus V1 DIY @Swinefeaster last edited by

                                @swinefeaster Patched it up and uploaded it :)
                                70e64f63-cc22-4c54-ac8f-011bd5fcef93-image.png
                                Linky

                                • Swinefeaster
                                  Swinefeaster @Lia last edited by

                                  @lia you're awesome thanks!

                                  • B
                                    biell Plus GT DIY @Lia last edited by

                                    @lia Just a quick update on this. I didn't get a chance to start digging in until yesterday. I have mostly fixed everything. Your current directory structure with mixed folders/not-folders is working, and all the uploads are getting sorted properly.

                                    When troubleshooting the div-by-zero error, I found that it was happening because something was moved to the bottom of the HTML on some pages (and so my $COUNT variable wasn't getting initialized). Well, it turns out some pages (e.g. topic 99) have a COMPLETELY different format with a <div class="post-bar">. These pages have a total lack of proper CSS styling (avatars are messed up, the timeline doesn't show, ...).
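One way to tell the two page formats apart is to check for that `<div class="post-bar">` marker before parsing. The sketch below uses Python's standard `html.parser`; it only detects the marker and makes no claim about what else differs between the layouts. `PostBarFinder` and `uses_post_bar_layout` are hypothetical names.

```python
from html.parser import HTMLParser


class PostBarFinder(HTMLParser):
    """Sets .found when any <div> carries the class 'post-bar'."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # A class attribute may hold several space-separated names, or be absent.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "post-bar" in classes:
            self.found = True


def uses_post_bar_layout(html):
    """Return True when the page contains the alternate-format marker."""
    finder = PostBarFinder()
    finder.feed(html)
    return finder.found
```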

                                    Some topics have all their pages like this, but most don't. For example, topic 54 only has one file in this weird format: 54 (34) Sounds and vibration _ Onewheel Forum.html .
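File names like the one above follow a recognisable pattern: a topic number, a number in brackets, the title, then " _ Onewheel Forum.html". A hedged Python sketch for pulling those pieces apart follows. The pattern is inferred from this single example plus the earlier description of adding the page ID in brackets, so real files may deviate from it; `parse_saved_name` is a hypothetical name.

```python
import re

# Assumed layout: "<topic> (<page id>) <title> _ Onewheel Forum.html"
SAVED_PAGE = re.compile(
    r"^(?P<topic>\d+) \((?P<page>\d+)\) (?P<title>.+?) _ Onewheel Forum\.html$"
)


def parse_saved_name(name):
    """Return (topic, page, title), or None if the name does not fit the pattern."""
    match = SAVED_PAGE.match(name)
    if match is None:
        return None
    return int(match["topic"]), int(match["page"]), match["title"]
```

Returning `None` for names that do not fit, instead of raising, keeps a bulk run going and lets the odd files be listed for manual review.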

                                    So, I am going through all that now to get these pages to render as close to the rest of the archive as possible. I think I have everything done except to copy the information from the bottom of the page back up to the top (posters, posts, and views).

                                    • B
                                      biell Plus GT DIY @biell last edited by

                                      $ ./forum_archive.pl RAW
                                      topic/2/welcome-to-the-onewheel-forum, Total: 3, Coverage: 100%, Missing: None
                                      topic/20/france-onewheel-riders, Total: 2276, Coverage: 80%, Missing: 0-47 83-120 160-169 249-253 274-281 302-339 384-397 523-527 566-581 687-704 880-892 940-943 1040-1041 1062-1072 1110-1111 1244-1253 1287-1312 1333-1339 1455-1459 1487-1492 1834-1842 1902-1996 2017-2025 2128-2139 2176-2193 2228-2238 2272-2275
                                      topic/22/android-app, Total: 58, Coverage: 82%, Missing: 0-9
                                      topic/23/petition-bring-back-extreme-shaping-1-0, Total: 26, Coverage: 100%, Missing: None
                                      topic/30/how-do-you-do-that-quick-180-spin, Total: 20, Coverage: 100%, Missing: None
                                      topic/31/footpad-order-page-down, Total: 69, Coverage: 69%, Missing: 0-15 64-68
                                      topic/33/airline-travel-battery-watt-hours, Total: 8, Coverage: 100%, Missing: None
                                      topic/34/classic-mode-dangerous, Total: 23, Coverage: 100%, Missing: None
                                      topic/35/second-day-riding, Total: 2, Coverage: 100%, Missing: None
                                      topic/36/riding-in-airports, Total: 28, Coverage: 100%, Missing: None
                                      topic/40/anyone-else-have-an-issue-connecting-with-iphone-4, Total: 3, Coverage: 100%, Missing: None
                                      topic/41/camera-and-mounting-tips, Total: 22, Coverage: 90%, Missing: 20-21
                                      topic/42/email-notifications, Total: 35, Coverage: 51%, Missing: 0-16
                                      topic/44/is-it-possible-to-mount-the-softer-yellow-vega-tire, Total: 10, Coverage: 100%, Missing: None
                                      topic/46/a-very-short-trail-ride-in-sf-and-lombard, Total: 19, Coverage: 100%, Missing: None
                                      topic/47/ower-in-k%C3%B6ln-und-umgebung, Total: 3, Coverage: 100%, Missing: None
                                      topic/48/riding-vs-kiting, Total: 2, Coverage: 100%, Missing: None
                                      topic/49/riders-in-hawaii-oahu, Total: 5, Coverage: 100%, Missing: None
                                      topic/52/bindings, Total: 24, Coverage: 100%, Missing: None
                                      topic/53/can-you-re-grip, Total: 61, Coverage: 45%, Missing: 0-14 43-60
                                      topic/54/sounds-and-vibration, Total: 34, Coverage: 100%, Missing: None
                                      topic/55/car-adapter-charger, Total: 17, Coverage: 100%, Missing: None
                                      topic/58/onewheel-travel-bag, Total: 10, Coverage: 100%, Missing: None
                                      topic/61/mud-guard, Total: 31, Coverage: 83%, Missing: 26-30
                                      topic/67/onewheel-clothing-and-accesories, Total: 13, Coverage: 100%, Missing: None
                                      topic/73/the-waiting-game, Total: 133, Coverage: 82%, Missing: 20-33 54-57 118-122
                                      topic/77/apple-watch-app-is-cool-but-one-issue, Total: 17, Coverage: 100%, Missing: None
                                      topic/78/incentive-referral-program-suggestion, Total: 11, Coverage: 100%, Missing: None
                                      topic/83/vertical-performance, Total: 4, Coverage: 100%, Missing: None
                                      topic/88/european-charger, Total: 3, Coverage: 100%, Missing: None
                                      topic/90/scratches-on-your-onewheel-frame-rails, Total: 49, Coverage: 83%, Missing: 0-7
                                      topic/92/any-london-england-riders, Total: 8, Coverage: 100%, Missing: None
                                      topic/96/worried-about-buying-a-v1, Total: 10, Coverage: 100%, Missing: None
                                      topic/97/nyc-riding-on-bridges, Total: 16, Coverage: 100%, Missing: None
                                      topic/98/cap-for-charging-port, Total: 29, Coverage: 41%, Missing: 0-16
                                      topic/99/trying-not-to-be-rude, Total: 31, Coverage: 64%, Missing: 0-10
                                      $ 
                                      

                                      My run (I included topic 20): https://drive.google.com/file/d/1jlOmlSBb-Hi2Dimhn5URw6tfA8Uro-U1/view

                                      The updated script: https://drive.google.com/file/d/1FFmb1LVADMPUvIMumuJdX50xRmqIq7iM/view

                                      Note that the script is a little hacked up now to deal with copying the header up to the top when it is at the bottom.
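For anyone curious how a summary line like the ones in the run above can be produced, here is a minimal Python sketch. It is not the Perl script's code. It assumes posts are numbered from 0 to Total-1 and that the percentage is truncated rather than rounded, which matches rows such as topic 22 (48 of 58 posts, 82%) and topic 31 (48 of 69 posts, 69%). `missing_ranges` and `summary` are hypothetical names.

```python
def missing_ranges(present, total):
    """Compress the absent post indices in 0..total-1 into 'a-b' strings."""
    ranges, start = [], None
    for i in range(total + 1):  # one extra step flushes a trailing run
        absent = i < total and i not in present
        if absent and start is None:
            start = i
        elif not absent and start is not None:
            end = i - 1
            ranges.append(str(start) if start == end else f"{start}-{end}")
            start = None
    return ranges


def summary(path, present, total):
    """Format one line in the same shape as the run output."""
    present = set(present)
    found = sum(1 for i in present if 0 <= i < total)
    # Assumption: truncating integer division; a zero total is reported as 0%.
    coverage = found * 100 // total if total else 0
    missing = " ".join(missing_ranges(present, total)) or "None"
    return f"{path}, Total: {total}, Coverage: {coverage}%, Missing: {missing}"
```

For example, `summary("topic/22/android-app", range(10, 58), 58)` gives the same line as the topic 22 row in the run.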

                                      • Lia
                                        Lia GT XR Pint Plus V1 DIY @biell last edited by

                                        @biell Thank you so much :) Taken a peek at your run and it looks good, will give that a whirl myself with the bulk of it and see how it gets on. Might need to discard some posts as you point out a handful are a bit busted so I can manually do them later if needed :)

                                        Really love the output for the stitching. Will give me a reference of what I need to look at and what's good to set in stone.

                                        • Lia
                                          Lia GT XR Pint Plus V1 DIY last edited by Lia

                                          Code works. Thanks @biell
                                          Squee.gif
                                          It's churned through all the topics. Total is just over 1000 currently which is pretty decent.
                                          Made a really simple .bat to run the perl script and spit out the console into a txt so I have the page stats to hand for later refining :3
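The .bat itself is not shown in the thread. For readers on other systems, an equivalent wrapper could look roughly like this in Python; the function name `run_and_log` and the log file name used below are assumptions.

```python
import subprocess


def run_and_log(command, log_path):
    """Run command, write its stdout and stderr to log_path, return the exit code."""
    with open(log_path, "w", encoding="utf-8") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode
```

A call such as `run_and_log(["perl", "forum_archive.pl", "RAW"], "page_stats.txt")` would keep the per-topic statistics in a text file for later reference.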

                                          However I am now manually adding the links to those pages onto the homepage.
                                          Wish me luck...
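Since every archived topic lives at a path of the form topic/<id>/<slug>, the homepage links could in principle be generated rather than typed. A small Python sketch under stated assumptions: the list-item markup and the root-relative href are guesses about the homepage, and `topic_link` is a hypothetical name.

```python
from html import escape
from urllib.parse import unquote


def topic_link(path):
    """Turn 'topic/<id>/<slug>' into an HTML list item linking to that page."""
    _, topic_id, slug = path.split("/", 2)
    # Decode percent-escapes for display only; the href keeps the original path.
    label = unquote(slug).replace("-", " ")
    href = escape("/" + path, quote=True)
    return f'<li><a href="{href}">{topic_id}: {escape(label)}</a></li>'
```

Feeding it the paths from the saved console output would give one line of HTML per topic to paste into the page.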


                                          9767aa7d-d586-4b9a-8270-637723fb9a3a-image.png
                                          65cc8971-44c3-4c43-8156-8c28c171cb5e-image.png


                                          Updating this as I go

                                          Topics 0-9600 links have been added
                                          Only what exists of course. (1180/1180)

                                          • B
                                            biell Plus GT DIY @Lia last edited by

                                            @lia Great news, I have been worried about this for a bit. If you run into any issues, don't hesitate to reach out.
