Episode 54

Jekyll and CMS-less websites with Young Hahn and Dave Cole

June 27, 2013

After many years of using Content Management Systems that store content in a database, there's a movement to store content in files instead. Jekyll and other tools, including GitHub, are springing up to create a new ecosystem of file-based tools. Young Hahn and Dave Cole join Jen Simmons to explain.

Transcript

Thanks to Matt Sugihara for transcribing for this episode.

Jen
This is the Web Ahead, a weekly conversation about changing technology and the future of the web. I'm your host Jen Simmons and this is episode 54. I want to say first thank you so much to the sponsor of today's show, Environments for Humans' CSS Summit. CSS Summit 2013 is a three-day online conference that's coming up in July. I'll talk more about that later. So today I thought we'd talk about Jekyll. Jekyll's a word that I started hearing more and more floating around in the universe, the web design and development universe. Like many things, you hear about this, that and the other, there are 20,000 different tools coming out all the time, but for some reason it kept coming up, from more and more people who I really respect and have learned a lot from, especially when it comes to best practices for websites: how you make a big enterprise website, how you make a website for a big company or a website with a lot of content. More and more people are looking at this thing called Jekyll. So I've had my eye on it now for a couple of years, and I thought this might be a good time to talk about not just this one particular tool but all the issues that it raises around how we make a website. What's the technology platform underneath, behind the scenes, that actually creates the HTML pages? How does that happen? We've been doing it a certain way for a while, and maybe it's not the original way, we started out doing it a different way, but maybe now there's a third way or fifth way or whatever that we're going to start seeing used more and more: really getting away from using a database to store content, and storing content in files. So I'm always looking for, okay, here's a topic, who's the right person to come talk about this? And it's just the folks at Development Seed. Development Seed is the shop in DC where I met Young Hahn, who's on the show. Hi Young.
Young
Hey Jen
Jen
And other folks from Development Seed several years ago at DrupalCon, because we were all working in the Drupal space at the time, and I was just so blown away by how smart you guys all are. Just, like, smart. So you've been doing a lot of work around this issue of moving away from content in databases and big heavy CMSs into content stored in files, so I thought I'd ping you all. And you've brought along with you Dave Cole. Hi Dave.
Dave
Hey Jen how are you?
Jen
It feels like every time I'm googling something about Jekyll there's a blog post and it's written by you.
Dave
Yeah we're trying to put as much of our internal process out on the web so people can have good documentation for what we're doing.
Jen
Yeah, so why don't we start by you all just talking about what Jekyll is, and what it is that you've been doing, moving away from CMS database storage of content, slash Drupal, into what you're doing now.
Dave
Yeah, it's a great time for the conversation. We just relaunched healthcare.gov, which is the government's signature website for information about the Affordable Care Act and how people can get prepared to get new affordable insurance plans coming online at the end of this year. So it's a traditional content-heavy site. There are about 1,200 posts, and it has all the latest features that we would look for in a modern site, such as responsive design and interactive components where people can enter information and get customized results back. So it's kind of a typical project, but we wanted to go about it in a different way. We used this as kind of a capstone project to test out a lot of the static website methodology that we're doing. Very simply, what this means is you take the content and the layout and you keep them in separate files, but you don't put anything into a database and you don't build any application around [it]. You just run a program that takes that content, drops it into your layout template files, and then generates static HTML. Or it can generate JSON or JavaScript, whatever static component you want, but at the end of the process you have a folder that contains everything you need to run your website just off of a standard web server.
Jen
In some ways it feels like a return back to what the web started out as. I mean, when the web was first invented there was no CSS, there was no JavaScript, you made HTML files. And if you had one webpage you had one file. One actual physical file on the hard drive. If you had 27 webpages you had 27 files on the hard drive. And at first that was the only way to make a website. And then there were tools that came around, especially Dreamweaver or sort of the stack that Dreamweaver came with, FrontPage for the Windows world, it wasn't Contribute, it was... I forget the name of the system that ran that sort of high-end Drupal, I mean Dreamweaver, kind of file management stuff. So there was this whole, I don't know, five years or something of more and more complex tools to manage one-to-one files. So you've got one webpage, one web file, one webpage, one web file. And then we moved into a world of, let's use a CMS, let's store all this content in the database, let's have robots running in the background. Every single time a person types the URL, if they're going to myfabulouswebsite.com/about, there's not actually a file there, there's a robot there waiting, and when the robot gets the URL it says, "Ah, I need to build this page for you," and it runs off to get some information from all the different places. It gets the content out of the database, typically, more frequently it gets a whole bunch of different pieces of content from different parts of the database, gets all the directions about what it's supposed to do, and it builds the page and sends it to the person who requested it, the user.
And in some ways that's awesome, because you can have all kinds of different interfaces for putting content into the database, and you can have content changing constantly, people can be making comments and all sorts of things. It's always different: the page gets built after the system knows who the user is. Like you go to Amazon.com and it knows I'm me and it says, "Oh, you bought this book last week, maybe you want to buy this book," or "Oh, you have a dog, so we want to show you this ad with the dog stuff." Like, dynamically generated per person. But then the bad side of that is you can melt the server if you have a lot of traffic. If hundreds of thousands of people are hitting your website at the same time, then the robot has to build that page hundreds of thousands of times per minute. So, in my rambling history of the evolution of web technology, we kind of hit a limit there. And so what a lot of people have been doing meanwhile is saying, "Okay look, we need to cache things, we need to not build this page every time a user hits the site, so let's set up memcache." I just had Josh Kerning*** come on the Web Ahead, episode 26, and he talked about servers and all the different things that you might want to have so that your webpage gets built once and then you have 1000 people visit it, and then it gets built again and then a thousand people visit that version of it, and it gets built again, so that you can handle that kind of traffic.
But then what you guys are talking about, I feel like, is sort of a third thing, and it's not so much going back to the first thing. It's more, I think, although I want to ask you, taking the best practices or learnings from the first two phases and inventing a new system where the content is stored in files, and the files get built once and then hit as many millions of times as they want. But there is still a complicated system to build those files, and the content can be separate from the structure of that content or the visual display of that content, and you can have some of the best practices that we've developed, where you keep things separate and you don't mix the way something is presented with the thing itself. Does that seem right to you?
Young
Yeah, I think that's a pretty good description of what's going on. I'd say, so to be clear, we are not the inventors of Jekyll, and we did not invent static site generation. These ideas have been around for a long time. But we kind of stumbled across it as we were looking at GitHub as a really useful tool for team collaboration, and one of the things that Tom Preston-Werner, one of the founders of GitHub, was doing was running his blog off of git. He was saying, "Oh, I use git to manage my code and collaborate with other people on code." For developers, one of the main outputs of all your creative energy is code, right? So you think of a writer writing articles and posts. Coders may blog once in a while, but the main creative output is code, so GitHub developed all these incredible tools around writing code. You can comment on specific lines of code, create issues to track your actions and milestones. Whenever you change one of your files in git you can comment and annotate and roll back, or branch your code to start testing out a completely new idea. And so what he did was he said, "Well, I want to blog sometimes, and I'm finding that I kind of want to do the same sorts of things around my blog that I do on my code. I want to test a new format or a new style of writing, so maybe I want to create a branch and just experiment with that in my branch. Or if I want to highlight some important lines in my introduction and comment on them and let someone else give me feedback, I want to be able to do that." So I think there's been a shift over the past couple of years: not only has there been a lot of adoption of git as a version control system, but people are taking a lot of inspiration from the tools that GitHub built around git, not just for code management but also as a platform for collaborating around content.
Jen
So will you explain what a simple setup is with Jekyll, for a blog for instance, for a person who does know code and Ruby and git? What are people typically doing, how does it actually work?
Young
Sure, so I'll just start from the ground up. What you do is you create a posts directory, create a folder, and you put the content you want in your blog there. Each blog post that you want to write is a single file, and in that file you can really use whatever format of writing you prefer. Most of the time I think people prefer Markdown. So you write your post in straight Markdown, and at the top of the file there's a section where you can add some metadata to your post. It's in YAML format, which is pretty easy to write by hand. So you'd write something like title: and the name of your post. You could give it a different date than, say, the timestamp on the file, and so on. So your file becomes this combination of content with metadata at the top. In your folder you may have some ten or so posts, and at that point that content is devoid of any layout or visual design; you haven't really made decisions about your CSS or what your site will look like. Then in the rest of your site folder you can create CSS files, you can add your images or whatever assets you want to load. The last piece is the layouts folder that you create: Jekyll will go into the layouts folder and it can use any of those layouts to wrap around your blog posts. There are some additional interesting pieces to Jekyll, but this is the core idea: your content is split out into individual files which contain content and metadata, and then you have a collection of layouts and other tools that describe patterns for putting together a webpage.
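An illustrative aside (not part of the recording): the "metadata at the top, content below" file Young describes can be sketched in a few lines of Python. This is only a conceptual sketch. Jekyll itself is written in Ruby and parses full YAML, whereas the hypothetical `split_front_matter` helper below handles only flat `key: value` lines, and the sample post is invented for the example.

```python
# Conceptual sketch only: real Jekyll parses full YAML; this handles flat
# "key: value" lines to show the shape of a post file.

SAMPLE_POST = """---
title: Hello static sites
date: 2013-06-27
layout: post
---
This is the *body* of the post, written in Markdown.
"""


def split_front_matter(text):
    """Return (metadata dict, body string) for a post with a '---' delimited header."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter: the whole file is content
    # Find the line that closes the metadata section.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return {}, text  # opening delimiter was never closed
    metadata = {}
    for line in lines[1:end]:
        if ":" in line:
            key, _, value = line.partition(":")
            metadata[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:])
    return metadata, body


metadata, body = split_front_matter(SAMPLE_POST)
print(metadata["title"])
print(body)
```

At this point the post carries no layout or styling decisions at all; those are applied later, when the site is generated.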
Jen
So how much is this like using Dreamweaver back in the day? Or, I just discovered recently that CodeKit, a tool that I use constantly, has its own Kit language, a Kit file system. It's almost like using PHP includes or something, where instead of building one HTML file for each page I'm building one file that's the top of the HTML and has all the header information for the HTML file, one file that's the footer, and then the middle is the content, and I can make a whole bunch of middles. With CodeKit, or back when I used Dreamweaver and Dreamweaver libraries, you would hit save and in one place in the file it would say, "Oh, you need to suck in the contents of this other file and place it here," and it would generate. Is it that, or does it feel like, in a way, more superpowers are needed?
Young
I'd say it'll feel very familiar to people who have been using PHP for templating or have been around the web enough to go all the way back to ColdFusion. The base concepts around templating, includes, loops within those files, referencing other assets, these are all very familiar constructs that you're going to find in Jekyll and other static site generators. There's nothing crazy different about it. I'd say the main difference between working on a modern CMS type of website and working on a statically generated website is what you explained: what you're building for is the content and the final output of the website. You're not trying to cater to individual requests from users coming in dynamically against the website.
Jen
So it also feels like there's a bit more consciousness in the Jekyll world around the idea that the file where the blog post is saved is just the content with the bare minimum of formatting, typically marked up using Markdown. You're basically saying, "This is going to be italics, this is the headline, here's the paragraph." You're not saying, "Here's the blue sidebar with the green letters." It seems like those files are very, very... like, there's nothing in there but the blog post, there's none of the rest of the HTML that will end up there in the end. Does that seem right to you, that there is a consciousness around web standards and the need to keep things separated in the Jekyll kind of world that's getting created these days?
Dave
Yeah, the content stored in the Jekyll posts is very similar to the content you would put in the database of a CMS. If it's Drupal, you're creating a node that has a certain number of fields for metadata and then a body content element, and that's basically the markup that you're talking about, the structure of the individual content post, and that's stored separately from the actual layout you're going to present the content in. Jekyll preserves that kind of templating, and as Young was saying, there are different layers of templating too, so a post can immediately go into a blog post template that might have a sidebar, and then that can go into a default template that would add the site-wide navigation on top. It keeps everything separate, and actually one of the benefits of this is that it's very easy to iterate on design. You can create new layouts and changes just as quickly as creating a new HTML template file, and then you can test it out and see if that's what you want. There isn't really any server-side code configuration or additional modules you need to plug in to iterate. And this was again really useful for the project I mentioned previously, healthcare.gov, when you're working on responsive templates and you need to test across many different platforms and devices and screen sizes.
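An illustrative aside (not part of the recording): the layering Dave describes, where a post goes into a blog post template and that in turn goes into a default template, can be sketched as a loop that keeps wrapping the content in the next layout up. This is a minimal Python sketch of the idea, not how Jekyll is implemented. The `LAYOUTS` table, its `parent` field and the `render` helper are hypothetical; `{{ content }}` is the placeholder Jekyll layouts use for the wrapped content.

```python
# Hypothetical layouts: each has a template with a {{ content }} slot and an
# optional parent layout that wraps it in turn.
LAYOUTS = {
    "post": {
        "parent": "default",
        "template": "<article>{{ content }}</article><aside>Sidebar</aside>",
    },
    "default": {
        "parent": None,
        "template": "<nav>Site navigation</nav><main>{{ content }}</main>",
    },
}


def render(content, layout_name, layouts=LAYOUTS):
    """Wrap content in its layout, then in that layout's parent, and so on."""
    seen = set()
    while layout_name is not None:
        if layout_name in seen:
            raise ValueError(f"layout cycle at {layout_name!r}")
        seen.add(layout_name)
        layout = layouts[layout_name]
        content = layout["template"].replace("{{ content }}", content)
        layout_name = layout["parent"]
    return content


page = render("<p>Hello</p>", "post")
print(page)
```

Trying a new design then amounts to adding or editing one entry in the layouts table and regenerating, which is the quick iteration loop described above.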
Jen
So then what about landing pages? It's kind of easy to imagine for a simple blog where each content file is a page. But one of the things that I've found really powerful about using a big CMS is the ability, for those who can do it, to quickly create separate lists and little blocks or modules of content that have teasers or thumbnails or, you know, "if you like this thing, maybe you'd like these other things too," and being able to grab content out of the database multiple times. You've got the canonical page that's got the main article, but then you've got all the landing pages and all sorts of teasers and previews of content all over the place. How can somebody do that using Jekyll? Is that something that's possible?
Young
Yeah, this is where Jekyll gets really interesting. In addition to taking your content and rendering it in a layout to produce HTML files, Jekyll actually manages all the content for your site in memory. What this means is you can create a post and call it index.html and you can iterate over every single post in your site. This is how you can iterate over posts sorted in reverse chronological order and get the last 10 posts of your blog, for instance. You can apply conditionals and say, "I want posts that are tagged with blog as the type, or posts that have hero as the tag on them, and those will be the featured posts that we want to have on the homepage." So there's a templating engine built into Jekyll, and it uses something called Liquid, which is basically a series of tags that you put in your HTML. You can do for loops and if statements, and there are a number of filters so you can change the case or manipulate strings. Those kinds of things that you would do in PHP are all available to you in the templates. Again, the real key here, even beyond the fact that the content is not stored in a database, is that you do all the processing for your site when you're actually writing content, not when the content is being served. Everything is a write-based operation instead of a read-based one. This means once your website is generated you can publish it to any simple server. You can put it on an FTP server that has a web-accessible front end, or you can put it up on a CDN directly, and there's just no server-side application that has to keep up with all the traffic. There's nothing to scale when your server gets really popular except for serving the simplest form of HTML, CSS and related web asset files.
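An illustrative aside (not part of the recording): in a real Jekyll template this kind of listing is written with a Liquid loop over `site.posts`. The Python sketch below only mirrors the logic Young describes (filter by tag, sort newest first, keep the first few) as it would run once at generation time. The `POSTS` data and the helper names are invented for the example.

```python
from datetime import date

# Hypothetical in-memory post list; a real generator would load these from files.
POSTS = [
    {"title": "Launch notes", "date": date(2013, 6, 20), "tags": ["blog", "hero"]},
    {"title": "Team update", "date": date(2013, 5, 2), "tags": ["blog"]},
    {"title": "New maps", "date": date(2013, 6, 25), "tags": ["hero"]},
]


def select_posts(posts, tag=None, limit=10):
    """Newest-first posts, optionally restricted to one tag, capped at `limit`."""
    matching = [p for p in posts if tag is None or tag in p["tags"]]
    matching.sort(key=lambda p: p["date"], reverse=True)
    return matching[:limit]


def render_list(posts):
    """Build the HTML fragment a landing page would embed."""
    items = "\n".join(f"  <li>{p['title']}</li>" for p in posts)
    return f"<ul>\n{items}\n</ul>"


# Runs once when the site is generated, not once per visitor.
featured = select_posts(POSTS, tag="hero", limit=5)
print(render_list(featured))
```

Because the selection happens at write time, the homepage that visitors request is already a finished file, however many posts were filtered to produce it.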
Jen
Yeah, I mean, that's one of the main advantages, this scalability. It's just kind of built into the system that it's going to stand up to tons of traffic, unlike a big CMS with a database.
Dave
And the setup becomes really simple too. As you have described, you have a directory, and in the directory you have folders and files, and that's your site. There's no application to set up, there's no database to configure. If you want to move from site to site you can just hand a folder over to somebody, or you can check it out from git as we usually do. As for how this changes the workflow, it's very similar, for those who are familiar with git, to how you develop an actual code project: everyone on our team, when they want to add a blog post, just checks out a copy of the entire site, works on the blog post, and then checks it back into git, and that regenerates the site for you. GitHub actually has Jekyll built in, so you can just check in those raw template files and content files, and GitHub will automatically produce the HTML and serve that publicly. So it's a very simple workflow, and the dependencies are very simple. You basically have Ruby installed and a couple of gems, which are extensions of Ruby, for running the site.
Jen
That's one thing that is a bit fascinating and a bit confusing, I think, for wrapping your head around it to start with. With Dreamweaver or CodeKit or whatever, building webpages locally, the old school way back in 1997 using Dreamweaver or some similar tool using includes, those pages got built locally, so when you were done you had those pages right there on your own computer and then you pushed them. So you needed to be working in a development environment, you needed to be in Dreamweaver or whatever tool you were using. And that seemed really awesome at first. There were many years where we all thought that was the most awesome thing ever, and when Tim Berners-Lee first invented the web, the first web browser had an editor in it. The idea was you weren't just going to read, you were going to make, and everyone was going to make a website. Everyone was going to learn Dreamweaver. Everybody was going to learn how to use these tools. And it sort of didn't really work out that way. So one of the things I've been seeing when people build Jekyll sites is the scripts that run, because there are robots that run around. When the robots run around and build the pages, they're not doing so on each page request, they're doing it when new content is created. Those might be local, but they might be up on your server, or, you're saying, up on GitHub, because GitHub has the ability to compile files using Jekyll on the GitHub Pages platform. So talk about what that is and what it means. And what if you're not using GitHub at all? You're just using your own git, you've got your own local development environment, but you want everything to be compiled server-side, so you run Ruby on your server and you install Jekyll and you just have it run. How does that work?
Dave
Yeah, so it's just a program that you run. You install it as a binary file, just like you would install any other program that you're going to execute on your server. You really don't need to know anything about Ruby to be running Jekyll, which is actually kind of nice, because people can just pick up and start blogging on the project that we put together just by learning Markdown. That's the lowest barrier to entry to write posts. Then if you want to edit the templates, you learn HTML structures and Liquid, which is the formatting engine for the HTML. So to set up Jekyll you would install Ruby as the development environment, install Jekyll as a gem, and then run the command line, just jekyll build or jekyll serve depending on what you're trying to do with it. There isn't a need to write server-side code, because all Jekyll is going to do is read through your directories, get your input files, your content files and your template files, and generate them into the output directory, which holds the HTML and other related assets that you want to copy onto your web server. So as you mentioned with the GitHub workflow, this just means you push all those input files directly up to GitHub, and GitHub automatically generates the output for you by running Jekyll on their servers. Or you can do this locally and just copy your files up to wherever you'd like to serve them. We also put together an open-source project called Jekyll Hook, which we released a couple of weeks ago as part of the healthcare.gov project. It's just a Node.js server that you can have running, we run it for instance on EC2, and it's running all the time. Any time there's a change in GitHub, this server is notified, it runs Jekyll for you, and then it executes a shell script to move the content wherever you want. So in our case that's using s3cmd to copy over to Amazon S3, and other projects could be using rsync to move to a different server.
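An illustrative aside (not part of the recording): the "notified, run Jekyll, then move the output" sequence Dave describes boils down to two commands run in order. The sketch below is not Jekyll Hook's code (that project is a Node.js server); it is a small Python stand-in that assumes `jekyll` and `s3cmd` are installed, that the site builds into Jekyll's default `_site` directory, and that `s3://example-bucket/` is a placeholder bucket name. The `run` parameter is there only so the sequence can be exercised without actually executing anything.

```python
import subprocess

# Commands assumed for this sketch; the bucket name is a placeholder.
BUILD_CMD = ["jekyll", "build"]
PUBLISH_CMD = ["s3cmd", "sync", "_site/", "s3://example-bucket/"]


def rebuild_and_publish(site_dir, run=subprocess.run):
    """Regenerate the static site, then copy the output to where it is served.

    `run` defaults to subprocess.run; pass a stand-in to dry-run the sequence.
    """
    for cmd in (BUILD_CMD, PUBLISH_CMD):
        # check=True stops the sequence if a step fails, so a broken build
        # is never published.
        run(cmd, cwd=site_dir, check=True)


# Dry run: record the commands instead of executing them.
recorded = []
rebuild_and_publish("/path/to/site", run=lambda cmd, **kw: recorded.append(cmd))
print(recorded)
```

A hook server would call something like `rebuild_and_publish` each time it hears about a new commit; swapping the second command for an rsync invocation covers the other deployment style mentioned above.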
Jen
So it sounds like there is a little bit of, you know, you've got to set up your servers in a certain way or use a tool that's already setting it up for you using running [24:50 inaudible]. But it doesn't sound any harder than setting up a server for a CMS.
Dave
Yeah, I think we should really focus on the role that GitHub has played in enabling this technology. For, I would say, 99% of the projects we've done over the past two years, all we do is design the site and check it into a GitHub repository, and then we turn over access to that repository to the client or the partner that we're working with. And it's that simple. GitHub automatically runs Jekyll for updates. We don't really have any long-term maintenance for running the servers; GitHub's providing that as part of the service. You get that with the basic plan on GitHub as a base-level service. Where this gets more interesting is when you want to have content editors create new content and edit existing content without needing to check out the code locally to their computer. Like the web interface that you get through a CMS, when you can just go to /admin on your website and create your content. For that we put together another open-source project that is itself hosted on GitHub. It's called Prose. It's basically a content editing interface for the posts in GitHub. It just uses GitHub's API, and it allows you to view all the files in your site, click on one, open it up, make whatever change you want, take a preview if you like, and then create a commit, and that saves directly into GitHub. So the whole workflow can be decentralized, and if you really embrace the culture and the workflow GitHub set up, it becomes very easy to get started. I think it's a much lower barrier to entry than a lot of traditional setups for CMSs.
Jen
So I want to talk extensively about content editors and non-nerds adding content to sites like this, but first I just want to... it sounds like Liquid is really where you have to start learning. Perhaps to do a simple blog or something it's going to be very easy, but to do a more complicated site that's got landing pages that are designed a specific way, whoever builds it is going to need to know Liquid and get into what is and is not possible using Liquid.
Young
Yeah, I think that's right, but the lowest common denominator for a Jekyll site is actually quite interesting. The way I started using Jekyll was a couple of years ago, when I was doing prototypes for the redesign of our website. At the time I didn't have any opinions about what sort of system we were going to use, so I was just sketching things out in HTML and CSS flat files. At a certain point I had some prototypes and people were ready to start developing on it, and we talked about looking at Jekyll as one of the options for powering the site. And it turns out that a static site, an index.html file, a landing page for your team page or your blog, or a magazine-style landing page, these are all... like, if you start running Jekyll in a folder that has these sorts of files, you have a legitimate Jekyll site on your hands. There is no need to use any of the templating in Jekyll, there's no need to use the posts or the layouts. All Jekyll will do in the case where you have these files is say, "Oh, you've got a static site already, here it is, I'll hand it back to you." And then the first thing you may do in one of your landing pages is say, "Okay, I've got this landing page mocked up that has these five top blog posts or five recent blog posts or five featured pieces of content, and right now I just have to copy and paste lorem ipsum into these spots. Wouldn't it be nice if I could have five posts in that spot that actually connected to my content directory?" And in that little spot, right in your HTML file, you drop in a Liquid for loop that looks for those posts, and you've got that content dynamically coming from your posts directory. So the really nice thing about Jekyll is that it's very friendly for people who love web standards, because if you learned HTML and CSS and JavaScript the right way and have been writing that content by hand for prototypes or mockups, you're already there, right?
The most important part of your site is done, and you can slowly, piece by piece, start to insert the loops and includes as you want, and you don't have to commit to a full switchover, if you want to put it that way. So that's how I actually got started, and that was one of the most surprising things for me: coming from a CMS background, where you have to commit to learning the whole stack from top to bottom, that wasn't the case at all. The first tag I learned in Liquid was a for loop, and I used it for exactly one thing, and everything else I kept prototyping in flat HTML, CSS and JavaScript. If the project didn't need any more complexity, then I probably would've stopped right there.
Jen
Yeah, it's interesting, prototyping. I've been doing a lot more design prototypes in code, and it does... I have flashbacks to the 90s, and I start realizing that the tools I used heavily in the 90s, I want them back. Because here I am and I've got three pages of prototypes, whatever, and I make a change on one of them, and I don't mind cutting and pasting and making the change across all three, like, "Oh, we made a change to the footer, I've got to make the change to the footer in three places." But then the next thing I know I've got 12 pages of prototypes, and man, making sure that it's the same footer all the way across, cutting and pasting, this is getting to be a real waste of time. Or they want to make one change to the navigation in the prototype and suddenly I'm cutting and pasting it across the whole pile of files. That's where I've been getting out Kit and CodeKit, because I just need to. But it feels like Jekyll fills that same kind of need, being able to extract out some of the details of the site and have things in one file instead of multiple files, so that you don't make mistakes and have to keep making changes. These tools might be something people want to use even if they're only going to use them for prototypes.
Dave
Yeah, that's a great way to get started, and that's exactly what Jekyll is enabling you to do. The nice thing about using Jekyll or another static generator is this concept that you're starting with the absolute least amount of work that you need to do. You're not configuring and setting up and installing a PHP application to use for your prototype, and you're not cutting and pasting things back and forth that could get lost or out of sync. You're just taking the things that need to be reused and putting them in a folder called includes, and you're taking the things that will format the layout and putting them in another folder called layouts. You're taking the content posts and putting them in a folder called posts. Then you just run a program that knows what to do with all those individual files to assemble your site for you.
Jen
In some ways it reminds me of Sass. For all the people who've made the switch from writing raw CSS in a regular CSS file to using something like Sass, the hardest thing is just switching your brain and setting up a slightly different development environment. You don't actually have to write anything in any sort of special Sass format, but then you can use it a little bit here and a little bit there, and the next thing you know it's six months later and you're using it all over the place and you can't remember how you ever didn't use it.
Dave
Right, and you usually wouldn't actually serve those Sass files to the browser, right? You would convert them back to CSS, and that's the role Jekyll's playing.
Jen
Yeah, it's literally the same, in that you're running a Ruby script that looks for your Sass files, sees that there are changes, and compiles them into CSS files for you. And if you want to change the output format, you know, what kind of format you're using, "Oh, we want to switch from development to production code," you make one change in the config file and then boom, you've got all your CSS rewritten in production format instead of in development format. In that way I think it's very similar, in that it's a script that runs and looks for certain files, and when it sees the presence of those files it creates other files. And those other files are files that humans don't touch; the humans don't ever edit the CSS files. I assume the rule is humans don't ever edit the results of Jekyll running, the HTML files. You only edit the source files for the site.
Dave
Yeah, exactly, because if you do edit those HTML output files, the next time you regenerate your site your changes are going to get overwritten.
Jen
Yeah.
Dave
There's an essay question in chat about formats, so this is again why you keep the content separated. You can write your content in markdown or textile or some very lightweight format that's really easy for people to learn. But if you want to make a completely custom layout, and we do this sometimes for a blog if we have a feature or product release or something and it needs to just look really good, you can just write the post completely in HTML, and then you're taking HTML as an input and HTML as an output. The difference is you still get the templating and the extra layered layouts that you get if you're using markdown or another format.
Jen
Are you also able to have Jekyll make, I think you said this before but, different outputs? So you have content, and on the one hand you're sending it to the website, but on the other hand you're sending it to an iOS application through an API, and you're also sending it to an RSS feed, and you're also sending it to some other kind of API that you have, where other websites that are related to you are pulling your content in from that API. Is that something that Jekyll is happy to do?
Young
Yeah, so the other fun thing about Jekyll is that it's not at all opinionated about your output format. So for example the output format is pretty much undetermined by Jekyll. If you decide to have a file that's called .html and it happens to have HTML in it, it's happy to template things into that. But if you decide to name the file .css and you actually want to use the content of your posts to influence what your CSS looks like and want to use Liquid logic inside of there to influence that, you can do that as well. So we've kind of embraced that aspect of Jekyll pretty aggressively, so we have a variety of projects where the main output is JSON, or the main output is maybe JSON plus an RSS feed alongside of it. So it's quite flexible in that regard, and I think the sort of way to think of Jekyll, sort of big picture, is that it's a converter, it's a compiler from the patterns that you've decided are important to capture in your content and in your layouts, and it's the converter and compiler to turn that into the final product that you're going to deliver to your users or to the applications that are integrating with your API or website or something like that.
Dave
Yeah, let's just throw out some real quick examples of that because I think it's a really neat way to build here. So for instance, if you want to create an RSS feed for your blog, just as you use the posts list to iterate through and create an HTML layout for your posts, if instead of wrapping those posts in HTML, you wrap them in the XML structure of RSS and give it a .rss extension, then you're going to have an RSS feed ready to go. As Young mentioned, you can also create a JSON API, so for instance on the healthcare site, every single post that's available in HTML is also available in JSON, so we just write a simple template that takes the values that you inserted in the HTML template and switches it over to JSON, so it's really easy to enable these other formats just by coming up with creative ways to use the templates.
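The idea of wrapping the same list of posts in a different structure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: in Jekyll the wrapping would be done with Liquid templates, and the post fields, the `render_json` and `render_rss` functions, and the example.com URLs below are all hypothetical.

```python
import json
from xml.sax.saxutils import escape

# Hypothetical post data; in Jekyll this would come from the posts list.
posts = [
    {"title": "Hello", "url": "/2013/06/27/hello.html", "body": "First post."},
    {"title": "Maps & data", "url": "/2013/06/28/maps.html", "body": "Second post."},
]


def render_json(posts):
    """Wrap the posts in JSON, as a simple read-only API response."""
    return json.dumps(posts, indent=2)


def render_rss(posts, site_title="Example Blog", site_url="http://example.com"):
    """Wrap the same posts in the XML structure of an RSS 2.0 feed."""
    items = "".join(
        "<item><title>{}</title><link>{}</link><description>{}</description></item>".format(
            escape(p["title"]), escape(site_url + p["url"]), escape(p["body"])
        )
        for p in posts
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<rss version="2.0"><channel><title>{}</title><link>{}</link>'
        "<description>{}</description>{}</channel></rss>"
    ).format(escape(site_title), escape(site_url), escape(site_title), items)


api = render_json(posts)
feed = render_rss(posts)
```

Both outputs are generated once, at build time, from the same source content, so adding a format is a matter of adding a template rather than adding a server.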
Jen
Was that the main motivation for Development Seed to switch away from Drupal to something like this?
Young
I'd say there were a lot of reasons that we moved off of Drupal, and I think we didn't really know all those reasons at the time, but Jekyll has definitely been sort of an eye-opening experience for us and our team, and it continues to be even two or three years later now. Especially this summer, even, I've been really impressed. We have a lot of new people on our team that don't necessarily have a technical background, but one of the things that people are to learn when they come here is basic HTML and CSS and just how to blog and how to write well and use GitHub and collaborate with other people, and the barrier to entry is just so low for someone to come in and pull together a new site or a really strong blog post; they don't need to know SQL, they don't need to learn a new system. They're going to learn through the core web technologies and that's going to take them really far on these sorts of systems.
Dave
Yeah, if we zoom up to a little bit higher level and just try to think about the role that this is playing, it's really a philosophical shift more than a technology shift. When we first moved off of Jekyll, or sorry, when we first moved off of Drupal, there were two kinds of development tracks that happened. One, we started building really custom data-heavy visualization tools that used Node.js servers to generate the HTML output. And that was kind of a direct predecessor to what we were doing in Drupal, where we were creating these data-heavy Drupal sites and just not being able to make them scale, or be as fast and responsive as we'd like, because there's so much configuration and other things going on in a Drupal site that just wasn't native for hosting data. If you have a million records and points you're going to put on a table or map, it doesn't need to be translated to a node in a content database. So the idea of creating these node applications, Node.js applications, seemed to make a lot of sense. On the other hand we just needed to have really lightweight pages where sometimes maybe the feature was just a full screen map visualization and you just need an HTML template to wrap around that with some about text and a legend and some other information. That was just done in HTML. So I remember actually one of the first projects Young worked on when I just started here was a map on disaster relief projects after the earthquake in Japan in 2011. And it was just an HTML site with some embeds pulling from some external APIs, and it was hosted on GitHub. So we kind of went back to the basics of just creating plain HTML and CSS by hand, and as these projects started to scale, we went in two different directions.
The node work became kind of the core of the Mapbox application, that is still a server side application, but with these kind of custom servers, server applications, you're generating APIs, mostly JSON APIs or other data feeds that front end applications consume, but for core site building, it's back to the basics, just create HTML, and then when you need to scale and create a few more pages of HTML and CSS instead of just one, use the lightest weight thing you can to create those templates. So in that case it's a converter like Jekyll where you just can scale up the content without having to replicate your templates and your layouts all over the place. So this was a paradigm shift as much as it was a technology shift. It wasn't the case that we decided we were done with Drupal and wanted to start building Jekyll sites, it was we were done with Drupal because we couldn't do the data visualization at the scale we needed, it required a lot of overhead for me to [40:07 inaudible] configuration, and it was going the direction of a much more all inclusive project than our work was taking us. And at the same time if we just want to build a quick website, you know, even maybe it's a wedding website, or a personal website for a friend, you know, you don't need to encumber them with an entire CMS to maintain that, so we started to go back to basics there, and I think what we've been able to do now looking forward is create a tool chain, a stack of services and things that are out there for people to use this methodology of creating simple static sites to the point where you can get it at scale of the size of some of the Drupal sites we were doing before. And there's just no way that we could have built healthcare.gov in Drupal that would've been as fast as the way it is on static HTML. Because at the end of the day, even if you've cached everything, you're generating and returning static HTML. So it's the exact same thing, we're just doing it with a lot less overhead.
There's no server cost to worry about if we have to scale up. So the tools like that, as we mentioned: Prose for content editing, Jekyll hook, this server that you can internalize if you don't want to use GitHub's built in Jekyll server, and then a bunch of little plug-ins that add features to Jekyll that kind of get that dynamic element back. That's the thing that you really will have to think differently about. It's easy to plug a module into a traditional CMS and get a feature. Jekyll doesn't usually have that kind of framework. You have a way to generate sites, and if you want additional features you either template that in your Liquid code, or you rely on JavaScript a lot more. We're at a confluence of events here where GitHub is a really successful workflow, APIs are more prevalent for major web services for embedding images from Flickr, video from YouTube, and JavaScript in the browser is just incredibly fast in most modern browsers and can be made to work even in older ones as well, so that's kind of what brought us to this point. It's not exactly a one-for-one switch back to basics; we really spent some time in the wilderness to think through what we were doing, and try to come up with the right paradigm that takes us into the future here in an era of multiple formats of publishing, responsive design, and really just having dynamic, fast JavaScript based interfaces.

Jen
So all this sounds completely awesome, yes let's use git for everything blah blah blah, oh except I'm building a website for people to add content to and they are not nerds. They don't have a code editor, they don't have git installed on their computer and they don't want git installed on their computer. They don't even want to learn a GUI to use git. They just want to go to a website, login, edit or add content, hit save. That alone is hard enough; doing that job well is enough. So part of what's a little bit like [cat noise] about this idea of getting rid of the database is, well doesn't that also mean that you're getting rid of all of those awesome tools, or well, we wish they were awesome, actually they're not as awesome as we'd like them to be, but maybe someday they'll be awesome tools to add content to a website. So you have this, so tell us more about Prose and about how you've approached, I mean I can only imagine if you're building giant websites like healthcare.gov, you're not assuming that every person in the federal government who is working on that website and adding content is gonna then also install a command line on their computer. So how are you approaching that problem of getting regular people to be able to easily add or edit content?
Dave
I don't think we found a WYSIWYG that we liked, or a CMS admin that was quite fast enough or responsive enough and intuitive enough to really rest on where CMSes have left us. So this was another thing we wanted to start from the beginning of and rethink. What makes this really easy if you have your code and your content in git in the same repository is that using a service like GitHub you have that available to you as a web API. So you can create updates or changes or new files or delete files; all the administrative tasks that a CMS would do on a database you can just do right over a RESTful API. This means we can build a lightweight front-end application. In the case of Prose we built an HTML and JavaScript application that's completely client-side, there's no server except for authentication involved, and all it does is give you the content in GitHub exposed in a really slick, clean editing interface and just allow you to make the changes that you want. What's really neat about it is in the course of the healthcare project we spent two people full-time on this project working on Prose, making the interface really easy to use. We would never be able to do that on a Drupal site, where you'd spend a substantial part of your budget for your project just on the editing interface to make it really good for content editors. The people who are making and maintaining your site are using that interface every day, so it does need the attention. So again, going back to the basics, rethinking editing is really important to try to figure out how to create a tool chain so that just anybody who wants to produce good content can have the right interface for that.
Young
Yeah, just to geek out here a little bit, Prose is really interesting in terms of talking about sort of the confluence of a bunch of new technologies that make this possible. Where, if you're talking about static site generation as something that has been around for ten years or more, since the beginning of the web, Prose is definitely the kind of technology that can only exist now, right, so it combines a bunch of really interesting things that even a couple years ago just weren't pervasive enough. For example, Dave talked about having an API to be able to do sort of CRUD operations against your content from GitHub. The prevalence of REST APIs is a new thing. The ubiquity of OAuth was quite surprising and a really important development on the web. So for a quick background, OAuth is a spec that basically defines how one website or one application can ask another application whether the user who is using the application should have permission to edit content on another website. So basically it's like if I know the user uses Twitter, I can ask Twitter whether the user should have permission to use their own Twitter account, and Twitter will allow that user to authenticate without me really creating a user account for that user or anything like that. So basically Twitter authenticates the user on their behalf. So OAuth, APIs, and finally client-side JavaScript, which has just really become incredible. Prose sort of comes out of our background of working with...we were a team that very early on went to Backbone.js, like 0.1, when the very first release from the DocumentCloud team's work came out.
It was something we wanted to look at, and being able to work with Backbone as a way to...you know, like if you've ever done client side JavaScript on a level where you want to do an application, you want to open documents, and maintain state, and do undo, and you want to save things, and you want to browse lists of documents, you want to manage all that data and keep it in sync with the UI, if you were trying to do this before the new MVC frameworks that are available now, you had a mess of jQuery and custom code just like spaghettied out all over your code, unless you were super guru about how you organize things. And people like Jeremy Ashkenas who wrote Backbone really took some of, as a super genius, he took some of the things he had learned in the past and really packaged those up as a common API for people who want to do this kind of work. So Backbone plays a really critical part in making Prose sort of a feasible project for us. These sorts of technologies really make it possible for us to put this final touch on the static site paradigm, which is that not only can we generate the site fast from a GitHub repo, but we can expose that GitHub API back to you through another interface that's very tailored to a certain audience or certain use case.
Jen
And with Prose, anyone who has a GitHub account can go to prose.io and use the Prose interface to edit the files that are in their GitHub repo?
Dave
Yeah, absolutely. Prose itself is just a static website. That's the beauty of it. So it's hosted right on GitHub and prose.io is just a custom domain pointing at that. When you go to prose.io, you'll authenticate using your GitHub account through OAuth, and you'll just have access to whatever data and repositories that your GitHub account is already authorized to use. Any changes go right back into GitHub through Ajax requests, so it's very lightweight and just piggybacks on all the infrastructure that GitHub puts in place.
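To make the "changes go right back into GitHub" step a little more concrete, here is a hedged Python sketch of the kind of request an editor could send to GitHub's repository contents endpoint to save a file. It is not Prose's actual code (Prose runs as JavaScript in the browser), and nothing is sent over the network: the function only builds the request. The `build_update_request` function, the repository names, the placeholder token, and the all-zero SHA are assumptions for illustration.

```python
import base64
import json

API_ROOT = "https://api.github.com"


def build_update_request(owner, repo, path, new_text, message, sha, token):
    """Describe the HTTP request that would save new_text to a file in a repo.

    Nothing is sent; the function only returns the pieces of the request.
    """
    body = {
        "message": message,  # commit message recorded in the repo history
        # The contents endpoint expects the file body Base64-encoded.
        "content": base64.b64encode(new_text.encode("utf-8")).decode("ascii"),
        "sha": sha,  # blob SHA of the file version being replaced
    }
    return {
        "method": "PUT",
        "url": "{}/repos/{}/{}/contents/{}".format(API_ROOT, owner, repo, path),
        "headers": {
            "Authorization": "token " + token,  # token obtained through OAuth
            "Content-Type": "application/json",
        },
        "body": json.dumps(body),
    }


# Hypothetical values throughout; a real editor would read them from the API.
request = build_update_request(
    owner="example-org",
    repo="example-site",
    path="_posts/2013-06-27-hello.md",
    new_text="---\ntitle: Hello\n---\nUpdated text.\n",
    message="Fix typo in hello post",
    sha="0000000000000000000000000000000000000000",
    token="OAUTH_TOKEN_PLACEHOLDER",
)
```

Because every save is an ordinary commit, the editing interface needs no database of its own; the repository history is the record of who changed what.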
Jen
So how has it been working with some of the clients that you're delivering sites like these to, and training them in how to add or edit content on the site? Have you found they really need to have a certain level of technical skills, or how's that been going?
Dave
I think it's actually a lower barrier to entry than a CMS because mixed in with content editing, there aren't these other administrative functions that usually come with a CMS interface. It's just about content editing. If you want to administer the site, then frankly you get into the code, and I think even when we were working on large-scale sites in Drupal, that was always what would happen anyway. There wasn't really this idea, kind of myth, that you could create an interface for non-developers to configure the site; at the end of the day you'd still need developers somewhere writing views code or customizing content formats. So Prose doesn't take that assumption, right, it takes the assumption that if you're using this interface you're just there to create content, and we have a full screen beautiful text editor. And we have the basic buttons on the top for markdown formatting so you can insert a header, or a link, or an image or lists, similar to what you get with WYSIWYG, but it's not actually trying to abstract out generated HTML for you. It's just inserting the markdown syntax. So you learn markdown and that's really the barrier to entry to using Prose. And we've found this actually works really well. So in a traditional kind of big site project, a lot of the content's being generated by a group of people and it's in Word documents with track changes, and they're putting inline Word formatting, like bolds and links, into the content, and then you try to take that out of Word documents and clean it up and get it into HTML, and you've got probably a week or two project, depending on how much content there is, just for that cleanup phase. So what we did in our most recent project is actually encourage the content writers to write their content in markdown.
They were still using Word documents but now they're passing around Word documents in markdown and they're not using any Word formatting, they're basically just using Word as a way to do plaintext and track changes, so that when they accept all track changes, the output is a markdown document and we can just save that as a plaintext file and copy it right into the Jekyll site, so content migration becomes incredibly easy. Now they're also trained up with the basic skillset that they need for editing markdown to maintain the site going forward through Prose. Markdown is really another great innovation that simplifies the way we think about content structures and gets us off the idea of trying to come up with a WYSIWYG that abstracts too much. If there's ever a case where we need a more custom or complete layout than what you can accomplish with markdown, you can always mix in HTML with your markdown too. And that's where your designer or developer will get into the content and create a little bit of extra formatting.
Jen
Yeah, well this is...Karen McGrane has been, all last year she was giving this really great talk about content and the importance of creating structured content for companies to really...media companies and other projects, any project, to really think about their content not as one giant blob, but in very structured chunks, so that you're asking for a package of assets, you're asking for a package of descriptions and of metadata. And if someone's adding an article to the system they're not just adding the main body text of the article, but they're also adding perhaps the short teaser, and a long teaser and a hero image and two thumbnails and this and that and the other. And that the content entry system, whatever that is, really should be more of a form that people fill out that has a bunch of fields and you fill the fields out, and the date is stored as a date field, and this particular thing is stored as what it is, and I've been kind of working the last several years on a way to say, "You're not allowed to make up layouts and sort of drop a bunch of HTML into the middle of a generic form field and kind of do special and unique floats, and stuff like pages. Like, we're designing a system here and if you need a certain kind of new layout, then let's make that, let's adjust the system so that there's another version, another possibility in the whole system." And meanwhile all the content that goes in goes in clean, goes in without any, maybe has a little bit of HTML, maybe a little bit of markdown, or a little bit of formatting, but it's more like, "Oh, in the middle of this paragraph there's a book title, and the book title should be italicized, so I'm going to italicize it," not, "Oh, I think this paragraph would look good blue, so I'm going to make this font blue."
Dave
Yeah, I think the diagnosis of the problem is right on; I think the medicine being prescribed here is a little too heavy. It's actually not practical to take all the content on a site, any site of really any size, and assume that you can ever come up with a form that's going to match it all. What works really well here is when you have your content in markdown and then you have kind of this free structure of metadata, the YAML content that you put at the top. One post could have six attributes and another could have two, and another post can have a completely different six, and it's up to you to write your layouts to know how to handle that. Look at the most successful web projects we've seen in content heavy industries like journalism. Look at the Snow Fall avalanche reporting piece that came out from the New York Times about a year ago now, actually about six months ago. And it's an incredibly custom presentation of really media rich, content heavy visuals, and it's very compelling and it works really well, and part of the reason is that it's unlike anything most people have ever seen. And you could never really accomplish that by assuming that you're going to just have your Snow Fall template in your CMS. The reason is your Snow Fall template is unique to that particular piece of reporting. And it's the same thing for the Mapbox.com product blog. When we release a really important feature in TileMill, we want to create a visual that's going to tell people and immerse them in what that feature is doing, not necessarily try to fit that into a structure that we've come up with. And if anybody's ever tried to, especially in a big enterprise environment, get their CMS modified to add a new layout or a new template, they know that going back to the system, and trying to change the system every time, is inherently difficult and stifles the agile and rapid kind of development and design that we're moving towards.
So having text files like this, where you can just totally customize a particular post and make it look like exactly what you want without impacting anything else in the system, is a very flexible way to try to manage your content.
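The "one post has six attributes, another has two" idea can be illustrated with a small Python sketch. It assumes a deliberately simplified front matter format, flat `key: value` lines between two `---` lines, and is not a real YAML parser; the `parse_post` and `render` functions and the attribute names `hero` and `author` are hypothetical. The layout code decides what to do when an attribute is missing.

```python
def parse_post(text):
    """Split a post into (metadata dict, body).

    Handles only flat 'key: value' lines between two '---' lines,
    which is a small subset of what YAML front matter allows.
    """
    lines = text.split("\n")
    meta = {}
    if lines and lines[0].strip() == "---":
        end = lines.index("---", 1)  # closing delimiter of the metadata block
        for line in lines[1:end]:
            if ":" in line:
                key, value = line.split(":", 1)
                meta[key.strip()] = value.strip()
        body = "\n".join(lines[end + 1:])
    else:
        body = text
    return meta, body.strip()


def render(meta, body):
    """A layout that copes with posts carrying different attributes."""
    parts = ["<h1>{}</h1>".format(meta.get("title", "Untitled"))]
    if "hero" in meta:  # optional attribute: only some posts have it
        parts.append('<img src="{}">'.format(meta["hero"]))
    parts.append("<p>{}</p>".format(body))
    return "".join(parts)


post_a = "---\ntitle: Release notes\nhero: map.png\nauthor: Dave\n---\nBody A"
post_b = "---\ntitle: Quick note\n---\nBody B"

page_a = render(*parse_post(post_a))
page_b = render(*parse_post(post_b))
```

Nothing enforces a schema here, which is the trade-off under discussion: the structure lives in the conventions the layouts expect rather than in a database table.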
Jen
That's what I find fascinating about this, is that tension between those two ideas. Because on the one hand there's something really correct about saying, you know, you're building a CMS, you need to build content types, you need to build structured content for those content types, and you want to have a bunch of fields, and you want to make sure that if you've got 15 different editors or authors that they're all going to fill out the form properly so that when you have 3 million records, all 3 million records have a very similar data structure, and you can later dump your CMS and go to a new CMS, or you can switch from this website to that website, or you can create a new API or a new JSON feed, or whatever, out of structured, similar, 3 million records.
Dave
Yeah, it's just not a problem anymore when you can create a new post with such a custom layout so easily, or when you can create a completely new site just as easily. Again, you copy your directory of a site that you're working on, give it a new name, and you've created a completely different site, and you can just make that one change that you want. So doing this through version control like git and having text files as your main interface, not databases and applications, lets you get at that core goal of having really flexible content presentations, and you can have structure in each of those different sites, but your ability to make changes and make custom experiences is far greater than what you get by trying to change a system that's got to manage different types of content in one place.
Jen
Yeah, yeah, I mean that's the, that's the...I feel like it's a big wait-and-see, like, because what we've seen in the past, and I've worked on quite a few university websites, so I'd come in and see, like, "Okay, where are you at now and where do we need to go?" And I'd see all the different problems that they were encountering, and all those legacy pages and pages and pages, and I think it especially happened in universities because you'd have different departments, it was the computer science department who originally built the first website and eventually convinced the university that they should have one, and then you end up with 14 different departments and they each have their own website and everything's handwritten in a different system and sometimes a different CMS. And somehow coming in and somehow wrangling all that chaos and building something kind of standard out of that is very helpful and very hard, very hard. But simultaneously, when you build structure like that you lose the flexibility, and you no longer have the ability to do something interesting or innovative or agile, you can't iterate quickly because you've built a structure, and the structure is intentionally built to prevent chaos and to force everybody into a system, then once you're in the system you're stuck in the system. So on the one hand the argument is make a really good system. Build it well, design it well, plan it well. And then on the other hand what I'm hearing you guys articulate is this need to say like, let's just get away from all that legacy and all that cruft and craziness and let's build something sleek and simple and modern.
Dave
Yeah, and it's not for lack of trying. If you look at Dev Seed's portfolio and the Drupal space before we moved on to other things, it was all about creating flexible contexts and ways to organize content at a much better level, and trying to strike this balance of structure and, as you say, build a really good system. The reality is you can't actually build a system that's actually going to be good enough to do what you think it needs to do and also anticipate the things that you haven't thought of yet. And by the time you build that system, your needs and demands have changed so much that you're going to start over. And I get the analogy of the universities, I saw very similar systems when I was working in the government previously, and we can all make whatever jokes we want to make about government IT because it's the example of large, catch-all IT bureaucracies that have very rigorous change management processes, and it just doesn't work at the end of the day. The savior here though is flexible content formatting; getting your content into markdown is a standard that you can move [to]. Even Drupal or WordPress can understand markdown and also create good APIs. So again, for a government site like healthcare.gov, it was very important to have all the content exposed in a JSON API. If that becomes merged with a couple other policy specific websites and put into one umbrella, we don't have to rebuild all the different sites, we can just tap into the known API formats. So the data is much more liberated than it would've been by trying to build a perfect system.
Young
Yeah, I just wanted to add that again, I think Jen, the way you put it, this tension is sort of, I think it's sort of one of those problems that I think keeps us really interested in this field, right? As web developers and designers, this tension I don't think is ever really going to go away, and I don't want anyone to get the impression that Jekyll is some kind of magical potion that's going to solve this problem for you, but I think one thing that's important to keep in mind, especially from my own experience, I remember that, you know, you want to delve into this question as deeply and as seriously as you can, right, like what's the right balance between structuring content for the creators and being able to let them express themselves when they need to make a big impact, or put shape to their idea in a way that what your system previously thought it was good to do couldn't. And if you want to deal with this problem in the most direct fashion, what you don't want to have happen as you're dealing with the problem is that you find out that the reason you can't make a certain decision or have a certain opinion about this problem is because you're using SQL and there's only so many columns you got when you started this project, and that's the end of the question; the big philosophical discussion that you were having internally as a team about how much flexibility to give ends because your database technology cannot allow that. Right, like that's the wrong place you want to end up as a team or as a developer, to answer that sort of question with this very mundane and terrible sort of physical answer which is sort of like, "You know what, SQL's not gonna do the trick," or the system you're using doesn't give you the chance to answer the question in a different way.
I think we've been talking a lot about Jekyll and markdown and sort of the particulars of the system. I think one of the really exciting things for me in this space is that if you actually go and use Jekyll, you'll probably be surprised by how little it does but how open it leaves these decisions to you. And in one way it will be quite frightening for a lot of people because people are used to having this question answered for them by the system they're using, and on the other hand you can create a Jekyll site that's very, very restrictive in terms of how the content is structured, and you can create a Jekyll site where the only metadata that's required is the title and everything else is fair game or something like that. So I guess I just want to highlight that this problem I think is super interesting and you want to deal with it at its most direct, in its most direct form. For me at least, being able to deal with it every time right in the face rather than sort of like running into SQL limitations, or Couch can only index documents by their key in alphabetical order or something like that, those sorts of limitations are ones that, when you want to deal with the hard problem, you don't want to be coming up with answers that taste like that.
Jen
Yeah, and it seems like Jekyll will let you do whatever you want to do for good or evil. That you could end up building a really crazy chaotic mess that happens to be built with Jekyll or you could build something very elegant and well-planned and future friendly and kind of brilliantly thought through.
Young
Yeah, that's exactly right. It's the same as Sass or any of these technologies that are going to let you create your own patterns. But then that does not mean that you're going to be creating good patterns, right, you're still in the driver's seat.
Jen
Yeah, I mean that sometimes is a criticism of Sass. People say, "Well, you can nest your CSS too many times, and you end up with too many selectors and it ends up really verbose and blah blah blah, that's bad, that's bad." Well yeah, you can do that in Sass, but you don't have to. That's not like, Sass didn't, it's not Sass's fault if that's what you do with Sass. That's just bad development technique. The...oh I forgot what I was thinking...Oh, also I feel like we've been saying Jekyll Jekyll Jekyll, but I've seen this separate from Jekyll, I've seen this same bigger trend, and that's really what I want to get at today, is this bigger trend of leaving behind the database and figuring out ways to put the content back into the files, but having that content be in separate files, files that are basically nothing but content, which is to me new and not how we did it in 1997. Where the file is just literally the blog post with a little light formatting and metadata as you described. That it might be Jekyll, it might be something else. I've seen other people building other tools that do the same kind of thing but a very...it's not running Ruby, it's running something else. But the shift from the database into content in files and the questions that it raises around editing workflow, best practices for a company that ends up having a lot of content and a lot of data stored, how to make sure that that's going to work for the future, how to impose a structure so that it is useful in the future. Those questions I think are, to me, it sort of ends up blowing open a whole new, "Well, we thought we knew what the best practice was, we were pushing everybody to go this certain direction, now maybe we don't even want to go that direction, maybe we want to do something completely different." What does it really mean to be storing content in files again, and how do we not screw that up?
Young
Yeah, just to make a couple observations: when we started working in this way, of course internally there were a lot of questions about whether this would be a good idea. Since then, once in a while we'll see posts from other companies or other groups that really give us a lot of encouragement about this direction. For me some of the most important posts were from MailChimp when they redesigned their site, not their most current redesign, but a couple iterations back. They talked about how they switched off of an internal CMS and moved to a static site generator that they created themselves, for a lot of the reasons we talked about today. And then in the past election cycle, the Obama web team actually did all their fundraising based on a static site. All of the payment systems were APIs with a static site in front. One of the mind-blowing things in some of the blog posts about that was that they actually had many different payment providers, because no single payment provider could handle the sort of traffic they were sending, but their static site was the single common denominator between all the payment providers. If payment provider A went down because of the traffic they were sending, they would just switch the static site to point at provider B and so on, and they just cycled through them until A came back up. And then, to talk about the big picture here: if we thought that the best practice was to put our content in a clean, structured format into a database like MySQL or Couch or Mongo, I think for us as a team, the new approach is actually not that different conceptually from that.
For example, we would never do static sites if we didn't have git to be able to check in our code and have a history of that content and our code, and be able to go through the logs and say, "this is when this file changed," or, "someone had a typo and they decided to change the first paragraph so that it was better written," and so on. And the fact of the matter is, git is just a much better database for content than MySQL, Couch, or Mongo will ever be. If you've ever used git, even if you're not a command line kind of git user, but if you go to GitHub or use some git UIs or even have a concept of how git's version control system works, it's just so much more powerful and so much more intuitive for content than what these other databases provide. So in a sense, our content is still in a database, but that database is a lot better and it was created for a very different need.
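The payment-provider failover Young describes can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Obama team's actual setup: the provider URLs, the `is_healthy` check, and both function names are hypothetical. The idea is that the static page is regenerated to point at whichever provider is currently up.

```python
import html

# Hypothetical provider endpoints, in order of preference.
PROVIDERS = [
    "https://pay-a.example/charge",
    "https://pay-b.example/charge",
]


def choose_endpoint(providers, is_healthy):
    """Return the first provider that passes the health check."""
    for url in providers:
        if is_healthy(url):
            return url
    raise RuntimeError("no payment provider is currently available")


def render_donate_page(endpoint):
    """Render the static donation form pointing at one provider."""
    action = html.escape(endpoint, quote=True)
    return (
        f'<form method="post" action="{action}">\n'
        '  <input name="amount">\n'
        "  <button>Donate</button>\n"
        "</form>\n"
    )


if __name__ == "__main__":
    # Pretend provider A is down; the rebuilt page points at provider B.
    down = {PROVIDERS[0]}
    endpoint = choose_endpoint(PROVIDERS, lambda url: url not in down)
    print(render_donate_page(endpoint))
```

Because the page itself is just a file, switching providers here means re-running the build and publishing the new file; nothing on the request path has to change.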
Dave
And I think the real key to what we're talking about here, even more than thinking about where the content is stored, is when it's actually generated. Not having the content be generated every time the page is requested, or every time the cache is cleared on a requested page, gives you much more freedom for the presentation. There are actually, I think, extensions to Jekyll where you could connect Jekyll to a MySQL database if you wanted, if that was something you were going for. But as Young said, that isn't the right way to go in most cases because of what git offers, as opposed to what MySQL offers you. But having the ability to generate your content ahead of time is very liberating.
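Dave's point about when content is generated can be illustrated with a small sketch. It assumes a toy site where posts live in a dictionary; `render`, `build_site`, `serve`, and the `calls` counter are hypothetical names, and this is not how Jekyll itself is implemented. The build step renders every page once, and serving a request is then only a lookup.

```python
import html

# Counts how many times the (pretend-expensive) render step runs.
calls = {"render": 0}


def render(post):
    """Stand-in for expensive templating work."""
    calls["render"] += 1
    title = html.escape(post["title"])
    body = html.escape(post["body"])
    return f"<article><h1>{title}</h1><p>{body}</p></article>"


def build_site(posts):
    """Build step: render every post once, ahead of any request."""
    return {f"/{slug}/": render(post) for slug, post in posts.items()}


def serve(site, path):
    """Request step: a plain lookup, with no rendering involved."""
    return site.get(path, "<h1>404 Not Found</h1>")


if __name__ == "__main__":
    posts = {"hello": {"title": "Hello", "body": "First post"}}
    site = build_site(posts)
    for _ in range(1000):
        serve(site, "/hello/")
    # Rendering happened once per post, not once per request.
    print(calls["render"])
```

In a traditional CMS without caching, `render` would effectively sit inside `serve`, so its cost would be paid on every request.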
Jen
So what are some of the limitations of that. I mean, have you built any websites that have comments using this technology?
Young
Yeah, the blog on the healthcare.gov site does, and we used a Twitter integration for comments on our Mapbox and Development Seed blogs. One of the points we were talking about earlier is how APIs have become much more prevalent. I would think you're probably going to be pretty challenged, either using out-of-the-box commenting in a CMS or using your own thing, to come up with a better commenting engine than something like Disqus or IntenseDebate, because those are companies that have spent a ton of time and energy and capital on coming up with a really good commenting system. Dropping in an embed like that is as easy as dropping in a YouTube embed for a video. Just like you probably wouldn't try to build all the infrastructure that YouTube built just to put a video on your site, it makes it very easy to get on board with dynamic features. So you can think about a constellation of dynamic features that you'd want on your site and patch them in from other API services. The nice thing about doing this, instead of building them into your CMS, is that if something in the code that runs your commenting system breaks, your entire website doesn't go down. If this were a PHP application, though, that's likely what would happen. Now just your comments block will disappear. Same with search, right? You probably don't do search this way, you probably use a different application for that, and that can be something custom, or it could be a service like Google Site Search that's provided as an embed. If that goes away, then you just lose that specific functionality, but your core site is still available. So these loosely joined pieces make for a more resilient and dynamic site and allow you to tap into a lot of the best practices. If you ever run into something that you need a really custom version of, maybe you do need to have a custom commenting engine that has your own technical rules.
There's nothing stopping you from creating that, and doing it however you want. Maybe you actually want to have a Drupal site that just produces content or comments, and you embed that back onto your blog. That gives you the ability to create custom applications and embed them. If you're us, for instance, again, we wouldn't run our mapping company out of a Jekyll or a statically generated site. That takes very custom software and finely tuned infrastructure to create a very scalable and custom mapping platform. But it has an API and it has an embed code, so there are a lot of points of entry to take the content from a service like that and put it back into the sites that we're building. So you can still build your own services, you can still build your own dynamic applications. The key is just to think of them as separate applications with a separate job from your main content-heavy website.
Jen
Yeah, because there's always a danger in using a third-party system. If I go and use Disqus as a commenting engine, first of all, I don't really like the way anything looks, so I'd probably want to skin it all, and I don't know whether or not that'd be possible. Second of all, I find it to be very, very slow; loading from their servers and their JavaScript is slow. And third, when they go under, or they get bought out by a company that I have problems with, or something goes wrong and they disappear off the internet, which happens all the time. Companies that we trust and love and give our data to, they disappear. There's always a little bit of a risk there. Especially if you're thinking about a company website, or you're putting together a system for a client or something, there are pros and cons to everything, there are risks to every choice.
Dave
Yeah, the challenge to look at then is: would you want to maintain your own infrastructure and servers for commenting?
Jen
Yeah
Dave
And if you would, then you do it exactly the way you would have done it in the past and just embed it. If you'd like to use something that you can embed, then that's an option, too. If your core business is commenting and you are a commenting service then you're probably going to build custom software for that anyway.
Jen
Yeah, and I feel like we're going to see more and more tools to support this, like Prose that you all made. Editorially is a system that's getting built for editing documents. It's a startup, it's not really in public release yet, but it's a very interesting set of tools for people to write text, collaborate, edit, show each other their work, discuss it, all sort of behind the scenes before it gets published. That's one thing that some content management systems try to provide: a way for the company that's running the website to write drafts of content and pass them around to each other and have a system for editing things before they get published. It'd be hard to write all that from scratch, but perhaps a system like Editorially could be used. It's all Markdown, I think it's all in git somewhere, and then you just have that get launched into the main site when it's ready to go. GitHub Pages, too. It just feels like there's an ecosystem that's slowly starting to build. You're starting to see more and more pieces of the chain, and maybe somebody will come along with a really great commenting engine, or a really great kind of user-generated, web 2.0, we're-gonna-have-a-chat-room or a forum or a discussion group, some kind of way that people can contribute that would not be possible with a static site all by itself.
Dave
Yeah, exactly, and if those are the services that you're looking to build, then you build them as independent services, rather than saying that you're also going to build a CMS that's going to handle all these other different roles and also serve your content directly.
Jen
Yeah, which is a trend we've seen before, where people used to try to sell the all-in-one perfect solution to be all things to all people, and that's sort of dissolving into: yeah, that's not going to happen, let's have a bunch of different pieces, and we'll all take the different pieces and build them in the way that we want. So I think it's going to be very interesting to see where this goes. You know, on your blog there's just article after article after article, and I highly recommend people go check it out, about how you did this, what tools you're using, how things are going, what you've discovered. And more people are starting to think about doing it this way. Cloud Four did this. I mean, they have built a lot of sites on Drupal, Lyza Danger Gardner and Jason Grigsby, and they relaunched their own website at cloudfour.com and switched to Jekyll. When Lyza was on the show she talked a little bit about their reasons for doing that and what they discovered as they were rebuilding their own site without a CMS. So, you know, I don't know what my point is. I feel like we're just at the very beginning of thinking about this, and we're like, "What does it really mean to drop the database?" I mean, the database has been central to everything in the last 10, 12 years. Having a database is the reason web 2.0 ever even happened. What other kinds of things have you discovered? What are some of the pain points, or the things that haven't gone as well, or where there are still rough edges, things that aren't quite where you'd like them to be?
Young
Well, let's see. It's not like working with Jekyll is... I mean, we're all cursing all day about various things about it. I'd say probably one of the trickiest parts of a Jekyll site is content validation. There are ways for you to screw up posts, and there's no great layer right now that will warn you that you've got badly formatted YAML in your header, or that you missed a quote in a YAML header or something like that. You'll find out after you try to build the site and it warns you that there's a syntax error, but there's nothing that's gonna stop you from pushing that content out to your git repo or publishing it on the web. But going back to the question of whether it is about databases and files, I'd say it's more like Dave was saying: it's really about separating two axes of what it means to serve content on the web. If you think of the traditional CMS, when a request comes in you're asking your system not only to serve requests, which is one axis, but there's this other axis of the amount of computing power and resources it takes to generate the content for that particular resource. If you put these two axes together, I think this is where things blow up, right? You take content that's expensive to generate, you multiply it by millions or billions of users on the other axis, and you're dead. And the interesting thing for me, as someone who's transitioned from building a lot of websites to working on Mapbox's core API products, is that I would be very surprised if you talked to anyone who builds large, scalable web services who doesn't understand the static site generation concept, even though they're not generating static websites.
It would be like if you watched a YouTube video and, for every user who wanted to watch it, the servers had to re-transcode the raw video into the format that they want to watch, right? You don't build scalable web services that way. By separating these two axes, you actually have a chance to control the CPU and resources you're going to spend on generating content. You've now separated that, as a different part of your system, from what it takes to serve individual requests to individual users, and that's how you scale a system like YouTube, that's how you scale a system like Twitter or Facebook. I kinda want to say this is not a new thing at all. It's really that this very old concept, from the people who had to deal with the scale problem, is finally trickling down to everyday web developers who are now facing the same problem. If you are a great designer and you do a great job on a small site that you weren't expecting a lot of traffic on, one tweet from one very popular person could blow up the traffic on your website. That's happening more and more these days, and I think the web-scale problem is now something that more and more everyday developers and designers need to understand and have a grasp on.
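On the content validation pain point Young raises at the start of this answer, a check that runs before a post is pushed might look roughly like the sketch below. It is deliberately minimal and is not a YAML parser: it assumes front matter made of simple top-level `key: value` lines between two `---` lines, skips indented or list lines, and only catches a missing delimiter, a line without a colon, or a value that opens a quote without closing it. The function name `check_front_matter` is hypothetical.

```python
def check_front_matter(text):
    """Return a list of problems found in a post's front matter."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["line 1: expected '---' to open the front matter"]

    # Find the line that closes the front matter block.
    closing = None
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            closing = index
            break
    if closing is None:
        return ["front matter is never closed with a second '---'"]

    problems = []
    for index in range(1, closing):
        line = lines[index]
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # blank line or comment
        if line[0] in " \t-":
            continue  # nested or list entries are not checked here
        key, sep, value = line.partition(":")
        if not sep:
            problems.append(f"line {index + 1}: expected 'key: value'")
            continue
        value = value.strip()
        for quote in ('"', "'"):
            opened = value.startswith(quote)
            closed = len(value) > 1 and value.endswith(quote)
            if opened and not closed:
                problems.append(
                    f"line {index + 1}: unclosed {quote} in value for "
                    f"'{key.strip()}'"
                )
    return problems


if __name__ == "__main__":
    post = '---\ntitle: "Missing quote\nlayout: post\n---\nBody text\n'
    for problem in check_front_matter(post):
        print(problem)
```

A check like this could run as a git pre-commit hook or as a step before the site build, so a malformed header is caught before the content is pushed. A real setup would more likely hand the header to a full YAML parser.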
Jen
Yeah, I mean, there really is a new emphasis on performance, and it makes a huge difference. I travel a lot, and hotel internet is some of the worst internet, and you get reminded of what it's like for the majority of people around the world who have very slow internet connections and data caps, who are very concerned about data and don't want to download tons of stuff. Whether they're on a phone or a tablet or a computer, sometimes it almost doesn't matter; it's very easy to get into a situation where the network is very slow and very precious. I was in London last week, and over and over again I would click on a tweet and I'd realize, "Oh yeah, this website's really slow," and I'd just immediately back out. It was like, there's no way I'm gonna hit that website right now, where other people's websites were very fast and snappy. And most of the websites that were fast and snappy are coming from static files. You could tell, or I know because I know the person's site and how they built it. It's just amazing how fast a website coming from static files really is.
Dave
Yeah, even with the dynamic system at the end of the chain, there's still the delivery of a static file to the browser because at the end that's all the browser can understand, so, it can only be as fast, if not a lot faster going this way.
Jen
Yeah, if you have to wait for the server to build the file then it can be slow, sometimes it's extremely slow. And if the file's just sitting there waiting for the server... Now of course you can do that. You can do that with something like Pantheon and Drupal. Drupal might be very, very slow, but Pantheon has basically cached everything and it's sitting there in memory waiting for you to come and ask for it. It's not generating the page on every request, it's generating the page every once in a while.
Dave
Yeah, absolutely, and I think CNN or some other major publishers use WordPress.com for publishing their sites. Certainly those are scalable. This democratizes the scalability, though. You don't need to have a memcache layer or somebody who's really good at setting up these different caching schemes available to build the site for you. You can just start with this approach, get the benefits of a simpler design pattern and flexibility, and then scalability comes as a bonus, as long as you're hosting the files on a reliable source. The nice thing, again, about GitHub providing so much of the service as part of their default offering is that you can put it on GitHub Pages and that's going to scale really well. We've hosted very popular sites right off of GitHub Pages and have very rarely had scalability issues.
Jen
Yeah, it's very interesting to say, "Well, just go play around with it. You've got a GitHub account, just go ahead and use GitHub Pages and start out with a small site of your own and see what happens." Or use these kinds of tools in the prototyping process, just to make that process faster. Whether or not it means replacing CMSs in client work or in big projects down the road remains to be seen, perhaps, for many different use cases, but it just feels like this is one of those things that's emerging on the landscape that has the potential to turn into something really big. Like a really big trend. Which is why I wanted to talk about it today, and kind of find out what the deal is with it. So thank you both for being on the show.
Young
Yeah, thank you Jen for having us.
Dave
Absolutely. Thanks!
Jen
People should check you both out. You're both on Twitter, what are your Twitter handles?
Young
I'm @younghahn. Young like a small boy and H A H N.
Dave
And I'm @dhcole, D H Cole.
Jen
And those two links, along with all these other kinds of links will be in the show notes for this episode which will be at 5by5.tv/webahead/54. Since this is episode 54. You also might want to check out, Alex came on and talked about Maps which is, you guys were mentioning it today you have Mapbox which is a product that you're, or a service I guess that Development Seed has created. Really fascinating to just talk about maps and how maps can be used on the web. Karen McGrane was here on episode 6 to talk about content structure and some of the issues that we were touching on today. She might articulate the opposite view about whether or not content needs to be, how it should be structured, although I don't think it's opposite, I just think it's this, the other need, the other set of needs, I think all of these needs are simultaneously true. And then Josh was on talking about servers and server stacks in episode 26, which I think was actually our most popular episode ever. Memcache, Varnish, all this kind of stuff we were mentioning. If you're interested in hearing more if you haven't already listened. Hearing more about server stacks and how what tools are available to make websites much faster. So, I guess that's it. Thank you again. Thanks everybody for listening and we will be back another day.

Show Notes