Intro to gumbopp
Gumbopp is a simple library that wraps Google’s gumbo HTML5 parser, originally written in C, with a modern C++ interface that should integrate well with the STL. It does so while providing a complete compiler firewall between the underlying C library and the C++ interface, so none of the C headers leak into code that uses gumbopp. Care was taken to make all of the features of the original library available to the user of the C++ interface. The library attempts to be forward thinking and uses some features from C++17; hopefully, as the new standard progresses and compiler support improves, so will gumbopp.
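For readers unfamiliar with the term, a compiler firewall is usually built with the pimpl idiom. The sketch below shows the general shape of that idiom in a single file. It is not gumbopp's actual source: ParsedPage, c_parse_result, and size() are hypothetical names made up for illustration, and the "C library" is a stand-in struct.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// ---- What would live in the public header ----
// No C library types appear here, only a forward-declared Impl.
// ParsedPage is a hypothetical class, not part of gumbopp.
class ParsedPage {
public:
    explicit ParsedPage(const std::string& html);
    ~ParsedPage();                          // defined where Impl is complete
    ParsedPage(ParsedPage&&) noexcept;
    ParsedPage& operator=(ParsedPage&&) noexcept;

    std::size_t size() const;               // hypothetical accessor

private:
    struct Impl;                            // opaque to users of the header
    std::unique_ptr<Impl> impl_;
};

// ---- What would live in the .cpp file ----
// This is the only place that would include the C header.
// c_parse_result is a stand-in for a handle owned by a C parser.
struct c_parse_result {
    std::size_t length;
};

struct ParsedPage::Impl {
    c_parse_result raw;
};

ParsedPage::ParsedPage(const std::string& html)
    : impl_(std::make_unique<Impl>(Impl{c_parse_result{html.size()}})) {}

// Special members are defaulted here, where Impl is a complete type.
ParsedPage::~ParsedPage() = default;
ParsedPage::ParsedPage(ParsedPage&&) noexcept = default;
ParsedPage& ParsedPage::operator=(ParsedPage&&) noexcept = default;

std::size_t ParsedPage::size() const {
    assert(impl_ && "ParsedPage used after being moved from");
    return impl_->raw.length;
}
```

Because the public class holds only a pointer to Impl, its size and layout stay the same when the hidden members change, which is what tends to make this idiom friendly to binary compatibility.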
To get started, clone the git repo like so:
The installation provides CMake configuration files, so using the library from a CMake project is as simple as:
Using the API
Using the API is simple enough: use the Parser::parse method to parse a string containing HTML, then use the Document object that is returned to find the nodes that you need. Below is an example:
Note that in the preceding example, the std::find_if could be removed and the same node could have been accessed directly by calling
The API also supports iterating through the Attributes that are defined on an element, like so:
The source is documented with Doxygen. It could probably be improved with more examples, but I believe the API is fairly easy to discover.
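To give a feel for the STL-style usage described in this section, here is a small sketch of a std::find_if lookup and a range-based for loop over attributes. It deliberately avoids gumbopp's real types: DemoNode, find_by_tag, and attribute_or are hypothetical stand-ins, so treat it as an illustration of the pattern rather than of the library's API.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DOM node; NOT a gumbopp type.
struct DemoNode {
    std::string tag;
    std::vector<std::pair<std::string, std::string>> attributes;
};

// Return a pointer to the first node with the given tag,
// or nullptr if no node matches.
const DemoNode* find_by_tag(const std::vector<DemoNode>& nodes,
                            const std::string& tag) {
    auto it = std::find_if(nodes.begin(), nodes.end(),
                           [&](const DemoNode& n) { return n.tag == tag; });
    return it != nodes.end() ? &*it : nullptr;
}

// Walk the attributes with a range-based for loop (C++17 structured
// bindings) and return the matching value, or the fallback if absent.
std::string attribute_or(const DemoNode& node,
                         const std::string& name,
                         const std::string& fallback) {
    for (const auto& [key, value] : node.attributes) {
        if (key == name) {
            return value;
        }
    }
    return fallback;
}
```

Any container that exposes begin() and end() works with the same algorithms, which is the main benefit of an interface designed around the STL.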
The Next Steps
Moving forward with the library, it would be nice to have a way to search through the mini DOM in the spirit of CSS selectors. In the meantime, the library should be stable enough for everyday use. As a side note, binary compatibility should be easy enough to maintain going forward. Stay tuned for some further announcements.