Organising the Archive
-
Code works. Thanks @biell
It's churned through all the topics. Total is just over 1000 currently which is pretty decent.
Made a really simple .bat to run the perl script and spit out the console into a txt so I have the page stats to hand for later refining :3However I am now manually adding the links to those pages onto the homepage.
Wish me luck...
Updating this as I go
Topics 0-9600 links have been added
Only what exists of course. (1180/1180) -
@lia Great news, I have been worried about this for a bit. If you run into any issues, don't hesitate to reach out.
-
@biell I'm steamrolling adding the links to the homepage now they're practically all there :) Thank you for getting me to the home stretch! Hope you enjoy seeing your contribution in action as much as I am with all the pages pretty much being restored :D
Sorry the delay worried you. The past month has been brutal D:
I think I somehow broke 2 bits but not sure what, maybe me editing some bits busted the formatting.
Timestamps aren't generating and some avatars are being broken :( The script mentions line 344 has a regex error or something.The slightly modified script is below. I altered the value for "logo_ht" from 80 to 60 and changed the image on my end to a newer version.
#!/usr/bin/perl =head1 NAME forum-archive - Put a google chached community.onewheel.com thread back together =cut use IO::Handle; use HTTP::Request; use LWP::UserAgent; use File::Copy; use File::Path; use File::Find; use Pod::Usage; use POSIX; my($HEADER, @POSTS, $FOOTER, $COUNT, $WEIRD); my(%META)=( 'base' => 'https://archive.owforum.co.uk/', 'logo' => 'http://archive.owforum.co.uk/Images/OWForumArchive.png', 'logo_ht' => '60', 'profiles' => '../../../assets/uploads/profile', 'resources' => '../../../assets/resources', 'system' => '../../../assets/uploads/system', ); my(%RESOURCES)=( 'fonts' => 'https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.4/css/all.min.css', 'style' => 'https://owforum.co.uk/assets/client-darkly.css', 'icon' => 'https://owforum.co.uk/assets/uploads/system/favicon.ico', 'broken' => 'https://icon-icons.com/downloadimage.php?id=5390&root=39/PNG/128/&file=brokenfile_5952.png', ); my(%TOPICS)=(); my(%SEARCH_PATH)=(); =head1 SYNOPSIS forum_archive <directory1> [directory2] [directory3] [...] =head1 DESCRIPTION This script takes a set of files downloaded from the google page cache for the community.onewheel.com NodeBB forum and tries to put it back together. =head2 Content Management A small number of resources are available from the internet. B<forum_archive> will download these assets, if needed, for inclusion to the archive file structure. If assets have been downloaded recently (within the last day) then new downloads are not attempted. This should keep B<forum_archive> from slamming remote resources during testing phases. =cut sub wget { my($src, $asset, $dst)=@_; my($file)=IO::Handle->new; my($req)=HTTP::Request->new( 'GET' => $src ); my($get)=LWP::UserAgent->new; my($response); $asset=~s|^[/.]+||; if(!-s "$asset/$dst" || -M "$asset/$dst" > 1 ) { $response=$get->request($req); if($response->is_success) { File::Path::make_path($asset, { 'chmod' => 0755 }); open($file, '>', "$asset/$dst"); print $file $response->decoded_content; close($file); } } } =pod B<forum_archive> will dynamically create the C<assets> and C<topic> directory structures as needed to store content found within the post. In an effort to increase efficiency for commonly used content, such as avatars, actual copying of these files will not occur each time such content is seen, after its initial copy. =cut sub location { my($file)=@_; my($try); if(-r $file) { return($file); } else { foreach my $dir (keys(%SEARCH_PATH)) { $try=join('/', $dir, $file); return($try) if(-r $try); } } &wget($RESOURCES{'broken'}, $META{'resources'}, 'broken-file.png'); return(join('/', $META{'resources'}, 'broken-file.png')); } sub copy { my($src, $asset, $dst)=@_; $asset=~s|^[/.]+||; if(!-s "$asset/$dst" || -M "$asset/$dst" > 1 ) { File::Path::make_path($asset, { 'chmod' => 0755 }); File::Copy::copy(&location($src), "$asset/$dst"); } } =pod Avatar image are stored in a central location, shared by the entire archive. This lowers the space requirements of the archive and increases page load times and browser cache efficiency. =cut sub avatar { my($img)=@_; my($dst)=$img; $dst=~s|^.*/||; ©($img, $META{'profiles'}, $dst); return(join('/', $META{'profiles'}, $dst)); } =pod Uploaded images which are stored in the archive may be named slightly differently on the archive than on the original. NodeBB has gone through a couple iterations about how to handle this conflict, and B<forum_archive> tries to handle this by using the more unique C<ALT> tag element parameter name. When that doesn't work, the original name is kept. Images are also grouped by post, to avoid naming conflicts between different posts. Additionally, if an image is referenced in a post, but is not contained in the archive, a standard broken file image is substituded. =cut sub upload { my($src, $alt)=@_; if($alt=~m/\.\w+$/) { ©($src, $META{'path'}, $alt); return(sprintf('<img src="%s" alt="%s"', $alt, $alt)); } else { my($new)=$src; $new=~s|^.*/||; ©($src, $META{'path'}, $new); return(sprintf('<img src="%s" alt="%s"', $new, $alt)); } } =head2 Archive Display One major change in the archive from the original is the banner. The original banner is replaced by one tailored to the archive, to set it apart from the original forum and make it clear it is a wholly different entity. =cut sub banner { my($start, $img, $end)=@_; return($start.qq{ <div class="container"> <div class="navbar-header"> <a href="http://archive.owforum.co.uk"> <img alt="The Archive homepage" src="$META{'logo'}" height="$META{'logo_ht'}"> </a> </div> <div class="navbar-header pull-right"> <p class="text-right" style="padding-top: 10px"> This page is an archived copy of the old Onewheel Forum. </p> </div> </div> }.$end); } =pod One of the differences which makes the archive different is, unfortunately, that some posts are missing. When this occurs, B<forum_archive> inserts a break in the timeline with a note about the message IDs which are absent. =cut sub missing_post { my($id)=@_; return(qq{ <li component="topic/necro-post" class=" necro-post timeline-event" data-index="$id"> <small class="timeline-text">Post(s) $id missing from the archive</small> </li> }); } =pod An interactive, HTML5 based NodeBB forum requires a lot of javascript to work. Since the archive is a static copy of that data, all of the javascript is removed, and the archive works nearly identically on all platforms. =cut sub global { s|https?://community.onewheel.com/|$META{'base'}|sg; s|<noscript>.*?</noscript>||sg; s|<script>.*?</script>||sg; s|<script .*?></script>||sg; s|\s+<div component="topic/reply/container" .*?</div>||s; s|\s+<a component="topic/reply/guest" .*?</a>||m; s|class="posts"|class="posts timeline"|mg; s|\n\n<hr>\n||sg; if(m|<span component="topic/post-count".*?>(\d+)</span>|m) { $COUNT=$1; } } =pod B<forum_archive> assumes that all the headers from all the source files are identical, and uses the first one it finds. With that content, the new banner is inserted, interactive metadata and buttons are removed, and the new style is setup. B<forum_archive> also collects important information like the page path and total message count. =cut sub header { local($_)=@_; #Cleanup to a reasonable starting header only s/(<ul component="topic" class="posts timeline" .*?>\s+).*$/$1/s; s/(<body .*?>).*$/$1/m; #Grab some important info if(m|<link rel="canonical" href="($META{'base'}(.*?))">|) { $META{'url'}=$1; $META{'path'}=$2; } #reset links #strip out unneeded content s|(<meta property="og:url" content=".*?)/\d+\?.*?">|$1">|mg; s|\s+<meta name="msapplication-\w+" .*?>||sg; s|\s+<link rel="icon" sizes=.*?>||sg; s|\s+<link rel="prefetch" .*?>||sg; s|\s+<link rel="prefetch stylesheet" .*?>||sg; s|\s+<link rel="manifest" .*?>||sg; s|\s+<link rel="search" .*?>||sg; s|\s+<link rel="apple-touch-icon" .*?>||sg; s|\s+<link rel="alternate" .*?>||sg; s|\s+<link rel="next" .*?>||sg; s|\s+<link rel="prev" .*?>||sg; &wget($RESOURCES{'icon'}, $META{'resources'}, 'favicon.ico'); s|(<link rel="icon" type="image/x-icon" href=").*?">|$1$META{'resources'}/favicon.ico">|mg; &wget($RESOURCES{'style'}, $META{'resources'}, 'client-darkly.css'); s|<link rel="stylesheet" .*?>|<link rel="stylesheet" href="$META{'resources'}/client-darkly.css">\n\t<link rel="stylesheet" href="$RESOURCES{'fonts'}">|s; if(m|forum-logo" src="(.*?/site-logo.png)"|m) { ©($1, $META{'system'}, 'site-logo.png'); s|forum-logo" src=".*?"|forum-logo" src="$META{'system'}/site-logo.png"|mg; } s|(<h1 component="post/header" .*?)>|$1 style="padding-top: 50px;">|m; s|\s+<section class="menu-section".*?</section>||s; #Insert new banner s|(<nav class="navbar navbar-default navbar-fixed-top header".*?>).*?<img alt="Onewheel Home Page" class=" forum-logo" src="(.*?)">.*?</nav>|&banner($1, $2, '</nav>')|se; #Remove unnecessary buttons s|\s+<a class="hidden-xs" target="_blank".*rss.*</a>||mg; s|\s+<div title="Sort by" .*?</div>||s; s|<li>[^RL]+<span>Register</span>.*?</li>||gs; s|<li>[^RL]+<span>Login</span>.*?</li>||gs; s|<a component="topic/reply/guest" .*?</a>\s*||s; s|<ol class="breadcrumb">.*?</ol>||s; s|<span class="hidden-xs">Loading More Posts</span> <i .*?</i>||mg; s|class="slideout-panel" style=".*?"|class="slideout-panel"|m; return($_); } =pod A lot of cleanup occurs within each forum post. Firstly, and with the javascript removed, all times are calculated and coded directly in UTC. Interactive buttons are removed, and links to content (such as user pages) not contained in the archive are also removed. Other interactive content (e.g. online status) is removed, too. Media, such as avatars and uploaded content is collected and placed properly into the new archive filesystem structures. =cut sub post { local($_)=@_; my($time); if(m/data-timestamp="(\d+)"/s) { $time=POSIX::strftime("%e %B %Y, %H:%M UTC", gmtime($1/1000)); s|(><span class="timeago") title="(.+?)">|$1 title="$time" datetime="$2">$time|sg; } s|<span class="replies-last .*</span>||mg; s|<a component="post/parent" .*?>(.*?)</a>|$1|mg; s|<i component="user/status" .*?></i>||mg; s|<a href=".*?/user/.*?">(.*?)</a>|<span class="btn-link">$1</span>|sg; s|<a class="plugin-mentions-user .*?>(.*?)</a>|<span class="btn-link">$1</span>|mg; s|<a href="[^"]+/user/.*?">\s+(<span class="avatar.*?>)\s+</a>|$1|sg; s|(?<= component="user/picture" src=")([^"]+)|&avatar($1)|meg; s|(?<= component="avatar/picture" src=")([^"]+)|&avatar($1)|meg; s|<img src="(.*?)" alt="(.*?)"(?= \s*class="\s*img-responsive)|&upload($1, $2)|meg; s|<a (component="post/reply-count".*? href=").*?/(\d+)[?#].*?(".*?)>|<a $1#$2$3>|mg; s|\s+<i component="post/edit-indicator".*?</i>||mg; s|\s+<i class="fa fa-fw fa-chevron-right".*?</i>||mg; s|\s+<i class="fa fa-fw fa-chevron-down hidden".*?</i>||mg; s|\s+<i class="fa fa-fw fa-spin fa-spinner hidden".*?</i>||mg; s|\s+<small class="pull-right">\s+<span class="bookmarked">.*?</span>\s+</small>||sg; s|(?<= class="avatar" src=")([^"]+)|&avatar($1)|meg; s|(?<= component="user/picture" data-uid="\d{1,5}" src=")([^"]+)|&avatar($1)|meg; s|(<img component="user/picture")|$1 class="avatar avatar-sm2x avatar-rounded"|mg; s|(data-uid="\d+") class="user-icon"|$1 class="avatar avatar-sm2x avatar-rounded"|mg; s|(title="\w+") class="user-icon"|$1 class="avatar avatar-xs avatar-rounded"|mg; s|id="[^"]*google-cache-hdr"||sg; s|This is Google's cache of||sg; return($_); } =pod Similarly to the header, the HTML after all the posts is based on the first file seen and removes some of the content better suited to an interactive stite than a static, archive site. =cut sub footer { local($_)=@_; s|<div class="progress-bar"></div>||s; s|<div class="spinner" role="spinner"><div .*?</div></div>||s; s|<div id="nprogress">.*?</div></div></div>||s; return($_); } =head2 Data Import Process Each downloaded F<.html> file from the forum is read and separated into 3 sections, a header, a list of posts, and a footer. The first file's header will be processed and used as the archive files header, same with the footer. Each post is pulled into an array. If a post occurs in multiple downloaded cache files, then the last one read is kept. Each one is processed and prepared for the final archive topic. =cut sub ingest { my($source)=@_; my($html)=IO::Handle->new; my($move, $category_url, $category); local($/)="\n\t\t\t\t</li>\n\t\t\t"; open($html, '<', $source); while(<$html>) { &global; if(m/^<!DOCTYPE html>/) { if(m/class="nprogress-busy"/) { $WEIRD=0; } else { $WEIRD=1; } if(!$HEADER || !$WEIRD) { $HEADER=&header($_); } s/^.*<ul component="topic" class="posts timeline" .*?>\s+\n//s; } if(m|</html>$|) { if(!$FOOTER || !$WEIRD) { $FOOTER=&footer($_); if($FOOTER=~s|(<div class="post-bar">.*\n<hr>\n\t\t</div>)||s) { $move=$1; ($category)=($HEADER=~m|<meta property="article:section" content="(.*?)">|m); ($category_url)=($HEADER=~m|<link rel="up" href="(.*?)">|m); $category=~s/&/&/g; $HEADER=~s|</h1>\n|</h1>\n$move|s; $HEADER=~s|<div class="tags pull-left">.*<div class="topic-main-buttons pull-right">|<div class="topic-main-buttons pull-left"><a href="$category_url">$category</a>|s; $HEADER=~s|class="stats hidden-xs"|class="stats text-muted"|mg; $HEADER=~s|(<span component="topic/post-count" class="human-readable-number" title="\d+">\d+</span>)<br>\s+<small>Posts</small>|<i class="fa fa-fw fa-pencil" title="Posts"></i>$1|s; $HEADER=~s|(<span class="human-readable-number" title="\d+">\d+</span>)<br>\s+<small>Views</small>|<i class="fa fa-fw fa-eye" title="Views"></i>$1|s; } } last; } if(m/data-index="(\d+)"/) { $POSTS[$1]=&post($_); } } close($html); } =head2 Execution The script expects a directory structure of HTML files which have valid links to media files. Other than that, it is pretty agnostic about the structure of the directory. It will read the header to find out what the name of the document should be, create it, and write to it. In addition to processing archived posts, a special post is inserted for anything missing. B<forum_archive> will also produce a report on F<STDOUT> with information on missing posts. =cut sub process { my($html)=IO::Handle->new; my($posts, $total)=(0, 0); my(@missing)=(); $HEADER=""; @POSTS=(); $FOOTER=""; $COUNT=0; foreach my $entry (@_) { &ingest($entry); } for(my $i=0; $i<$COUNT; $i++) { if(!exists($POSTS[$i])) { my($begin); for($begin=$i; !exists($POSTS[$i+1]) && $i<$COUNT; $i++) { $posts++; $total++; } if($i==$begin) { $POSTS[$i]=&missing_post($i); } else { $POSTS[$i]=&missing_post("$begin-$i"); push(@missing, "$begin-$i"); } $posts++; } $total++; } if($total) { printf("%s, Total: %d, Coverage: %d%%, Missing: %s\n", $META{'path'}, $total, (1-$posts/$total)*100, join(' ', @missing) || 'None'); } else { printf("%s, Total: %d\n", $META{'path'}, $total); } File::Path::make_path($META{'path'}, { 'chmod' => 0755 }); open($html, '>', join('/', $META{'path'}, 'index.html')); print $html $HEADER; print $html @POSTS; print $html $FOOTER; close($html); } if($ARGV[0] =~ m/^-+h/i) { pod2usage(-verbose => 2, -exitval => 0); } elsif(! -d $ARGV[0]) { pod2usage(-verbose => 1, -exitval => 0); } find(sub { $File::Find::prune=1 if(m/^assets$/); $File::Find::prune=1 if(m/^topic$/); if(m/(\d+)\s+.*\.html$/) { push(@{$TOPICS{$1}}, $File::Find::name); $SEARCH_PATH{$File::Find::dir}=1; } }, @ARGV); foreach my $topic (sort({ $a <=> $b } keys(%TOPICS))) { &process(sort(@{$TOPICS{$topic}})); } =head1 NOTES B<forum_archive> is basically a conglomeration of regular expressions. This is by no means the best way to manage and manipulate complext HTML files. However, given the static nature of this content and its relative complexity, using regular expressions requires a substantially smaller code base and interpretation of the original source files. Essentially, in this case, it is too much easier to strip out the junk you know you don't want than to understand the entire document schema fully enough to make the meaningful changes the right way. =pod
-
@lia Line 344 does have to do with avatars
s|(?<= component="user/picture" data-uid="\d{1,5}" src=")([^"]+)|&avatar($1)|meg;
That should be OK. Maybe the version of perl you are using has an issue with variable width look-behind, but I purposely used
\d{1,5}
instead of\d+
so it wouldn't be infinite (which should cause an issue).What version of perl
perl -v
do you have?Edit: misspelled "width"
-
@biell Look-behind was the message it gave, well caught.
I'm using Strawberry Perl 5.32.1.1-64bit on a Windows10 machineLooks like my version of perl doesn't work with either. Gave
\d+a
a shot and it failed to run. What version do you run and I'll see if I can use that. Might explain why timestamps didn't load either :3 -
@lia said in Organising the Archive:
@biell Look-behind was the message it gave, well caught.
I'm using Strawberry Perl 5.32.1.1-64bit on a Windows10 machineLooks like my version of perl doesn't work with either. Gave
\d+a
a shot and it failed to run. What version do you run and I'll see if I can use that. Might explain why timestamps didn't load either :3virtualized/containerized i hope. with a proxy. @biell
-
@notsure Work laptop, completely isolated from everything including work since I don't have the VPN hooked up.
I use the laptop as a little sandbox that I reformat every now and then when it gets a little funky :)Back to adding more links to the homepage. Dropped off at 4am
last night...this morning? Oh dear more energy drinks needed. -
@lia said in Organising the Archive:
I use the laptop as a little sandbox that I reformat every now and then when it gets a little funky :)
vm. containerize. proxy. time-gated, off-site backups. @biell !!!
-
So going through these posts I'm seeing a ton of amazing pics that unless you actually go digging through threads most will never see. like look at this one from DreamTour in the "Your Pint Shipping Date".
Anyone think it might be a cool idea to have a gallery page on the archive that carousels some random pics while also giving access to a gallery of pics ranging in time uploaded?
-
@lia I added two fixes here: https://drive.google.com/file/d/1FFmb1LVADMPUvIMumuJdX50xRmqIq7iM/view
I assume Strawberry perl is having trouble with the zero-width look-behind, so I removed it and replaced it with a slightly less efficient construct.
I am also assuming that Strawberry perl doesn't have a proper POSIX::strftime (Windows isn't POSIX compliant, so this wouldn't be surprising). I just added my own little function to do the same thing for this specific instance:
my(@MONTH)=qw( January February March April May June July August September October November December ); sub utctime { my($epoch)=int($_[0]/1000); my($sec, $min, $hr, $day, $month, $year, $wd, $jd, $dst)=gmtime($epoch); return(sprintf("%d %s %d, %02d:%02d UTC", $day, $MONTH[$month], $year+1900, $hr, $min)); }
You should be able to just rerun this version and it will overwrite the old files with fixed ones. Topic 22 had examples of the messed up avatar.
Also, LET'S GO, SPURS!!!
-
@biell Amazing, it worked :D Thank you so much!
I can't seem to get the favicon to work for some reason.
Tried to strip out thetype="image/x-icon"
from the below but that seems to cause the favicon section to then pull extra text and break.s|(<link rel="icon" type="image/x-icon" href=").*?">|$1$META{'resources'}/favicon.ico">|mg;
Spent an hour trying to figure it out but I'm no good with interpreting code lol.
Currently the end result is the following line:
<link rel="icon" type="image/x-icon" href="../../../assets/resources/favicon.ico">
Could it be tweaked to instead generate:
<link rel="icon" href="../../../assets/resources/OWForumArchiveIcon.png">
-
-
@lia So, that was my bad. I was downloading to
assets/resources
, but looking for it inassets/system/uploads
:(Updated now so that the
META
config at the top points to the location you want for'icon'
. This way you can change it in the future if you want to, e.g. put it in/Images
with OWForumArchive.png.https://drive.google.com/file/d/1FFmb1LVADMPUvIMumuJdX50xRmqIq7iM/view
-
@biell Ah found the issue, there's something in the topmost tags that's busting the favicon.
Surrounding the top <style> tag is some <body> and <head> which although don't mention an icon I think they're causing the browser to ignore the tag later. Used topic 20 to test as an example below.
Starts with a <html> tag that we can keep
<!DOCTYPE html> <html lang="en-US" data-dir="ltr" style="direction: ltr;" class="nprogress-busy">
Then the element that seems to break it starts.
<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><!--<base href="https://archive.owforum.co.uk/topic/20/france-onewheel-riders/990?page=110&lang=en-US">--><base href=".">
In between is a style tag that we need to keep as removing this bit actually breaks some elements.
<style>body{margin-left:0;margin-right:0;margin-top:0}#bN015htcoyT__google-cache-hdr{background:#f8f9fa;font:13px arial,sans-serif;text-align:left;color:#202124;border:0;margin:0;border-bottom:1px solid #dadce0;line-height:16px;padding:16px 28px 24px 28px}#bN015htcoyT__google-cache-hdr *{display:inline;font:inherit;text-align:inherit;color:inherit;line-height:inherit;background:none;border:0;margin:0;padding:0;letter-spacing:0}#bN015htcoyT__google-cache-hdr a{text-decoration:none;color:#1a0dab}#bN015htcoyT__google-cache-hdr a:hover{text-decoration:underline}#bN015htcoyT__google-cache-hdr a:visited{color:#4b11a8}#bN015htcoyT__google-cache-hdr div{display:block;margin-top:4px}#bN015htcoyT__google-cache-hdr b{font-weight:bold;display:inline-block;direction:ltr}</style>
Then a closing section to the earlier <head> and <body> which needs removing.
</head><body class="page-topic page-topic-20 page-topic-france-onewheel-riders page-topic-category-2 page-topic-category-general-discussion parent-category-2 page-status-200 user-guest skin-noskin">
I've "repaired" topic 20 is you want to take a peak
https://archive.owforum.co.uk/topic/20/france-onewheel-riders/index.html -
@lia Sorry, I didn't realize all that junk was up there. Firefox was honoring a lot of that meta content outside of HEAD, which it shouldn't have, so I didn't notice it. The latest version with a fix (same google drive link) is updated and ready.
Are you sure that
<style>...</style>
section is necessary, what did it break? I tried wiping all of it out and couldn't find any issues. -
@lia I also just uploaded a version where the dates are no longer "permalink" items to post links which don't exist.
-
@biell If I inspect the page with F12 and remove the style a few errors show up so presumably that section is referenced. Probably not worth trying to find and remove the references so I think keeping that small <style> section will be fine.
-
@lia I removed the post hamburger menu and made the upvote/downvote chevrons no longer links (if you clicked on any them the do nothing except take you to the top of the page).
Also, in doing that, I noticed that sometimes the timestamp is to the left because different posts have different HTML layouts. So, I fixed that too.
Same google drive link.
-
@biell That's amazing thank you!
I noticed that sometimes the timestamp is to the left because different posts have different HTML layouts. So, I fixed that too
I did notice this but felt it wasn't worth bothering you over. Really appreciate you going out of your way to fix that. I assume that's from the pages that didn't get saved normally.
Currently re-running the script and preparing to replace the live data after.Finished and swapped out to the new data, looks good!!!Just got to make my way though adding the links. So many that I might paginate the homepage a little. Anyone finding any issues with load times or overall experience so far using the archive?
-
@lia For the main page, I feel like it should have a filter so you can type in something and search for topics. If you are interested, I did a mock-up here:
https://drive.google.com/file/d/1KCtjxqxqH3dJZPpe7QhAcb4v0qUWGS6c/view
I just threw this together in a few minutes, so the input box is literally just thrown on there in some semi-reasonable place. Essentially, as you type, it goes through the list items and changes anything which doesn't match to "display: none" and anything which does match to "display: block".
You can ignore the
<base>
tag, as I just needed that to make my copy work. Then, I have a new<style>
which should be moved to your style.css, and a<script>
which can stay there or go into it's own file. After that, I added anid
to your<ul>
element so I could find it, and put the<input>
element on the page to type into.