Developing Pinpointer

  • This piece was written over a year ago. It may no longer accurately reflect my views now, or may be factually outdated.

How many times have you told a friend or colleague, “Go to http://example.com/some/doc and search for XXXX”?

This post details the creation of the Pinpointer browser extension, which you can check out here.

The problem

Say you’re reading an article, and you find a certain portion that really speaks to you. Obviously, you want to share it with all of your pals! The article’s pretty long, though, and you’re not sure they’ll be able to find the exact part you want to share with them. Alas, when it comes to sharing links that focus on a particular section of a page, you currently have only one option: the fragment identifier, as specified in RFC 3986, §3.5. For the rest of this article, I will only be dealing with fragment identifiers for pages with a text/html MIME type (that is, pretty much any web page).

Let’s say you want to show someone this page. Easy: you just send them the link (e.g. https://bengoldsworthy.uk/program/Pinpointer). Suppose, though, that you want to direct their attention to a specific element on this page, such as this bit of text. Also easy: you can just send them the link with a fragment identifier on the end (e.g. https://bengoldsworthy.uk/program/Pinpointer/#example1, although bear in mind that it will put the element at the very top of the web browser view, which on this site will leave it hidden under the navigation bar). However, this will only work if the original page creator has deigned to add ids (or now-deprecated names) to all of the interesting bits of the page, which is unlikely.

Better still, if they have done that, they might also have added a :target rule to their stylesheet that makes it visually apparent where the linkee’s attention should be pointed (e.g. https://bengoldsworthy.uk/program/Pinpointer#example2). However, that’s even less likely than them having added the ids in the first place. As JJJ puts it:

“Since you can only link to named anchors or elements with ids[…]if the target page has neither you are out of luck.”[1]

Previous attempts to solve the problem

It seems bizarre to me that there seem to be so few existing attempts to solve this irritating problem. Despite JJJ’s claim above that we are out of luck, I have managed to find three possible solutions so far. One, as JJJ later suggests, is to embed the page in an <iframe> element, allowing one to make whatever modifications to the page one fancies. However, this requires the person sharing the link to have their own web page in which to embed the <iframe>. Considering how many people share links over messaging apps and the like, this seems an unreasonable expectation.

Another solution is the FXPointer extension for Firefox. This extension purports to allow one to specify arbitrary elements using XPointers, which were devised in order to solve this same problem within XML documents back when those were all the rage. However, the latest version—0.2.9—came out in 2009 for Firefox 3.5, so we can safely assume this project is dead.

Finally, this MetaFilter thread includes a suggestion for something called PurpleSlurple. The website seems to be dead now, but from the Wayback Machine I gather that it was some sort of website that would take the link you wanted to share and attach unique IDs to the elements, allowing you to link instead to the PurpleSlurple-hosted page.

EDIT: It turns out I somehow missed another browser extension that aims to provide the same functionality and has an almost identical name to mine, based on a similarly-missed proposal for CSS fragment identifiers, but which hasn’t been updated since 2011. Not sure how I managed that.

The solution (in theory)

To start with, how can we easily identify DOM elements in our fragment identifiers? If only we had a language specifically designed for selecting arbitrary page elements. Oh wait. I believe, further, that FXPointer had the right idea by opting for the browser extension route. Firstly, I don’t quite have the sway to amend the RFC 3986 fragment identifier spec to encompass CSS selectors. Secondly, using an intermediate website such as PurpleSlurple raises thorny issues of copyright and privacy, as well as being a pain when your linkee wants to post a comment on the article you’ve sent them and has to navigate back to the original site to do so. Thirdly, whilst a browser extension does have the downside of only working for those who have adopted it, this seems like the smallest price to pay for linking precision. As long as the extension does what it says on the tin efficiently and quietly, and no more, there should be no real barriers to adoption.

So what does this browser extension need to do? First, it needs to add an option to select a given element. Then, it needs to allow a means of moving up and down the DOM in order to highlight the specific element the linker wants—after selecting an li, for example, the user should be able to bump their selection up to the encompassing ul. This done, the unique CSS selector for the specified element should be appended to the page URL and the resulting URL displayed (with a convenient option to copy the text to the clipboard). As a handful of characters used in CSS selectors—#, :, etc.—are listed in RFC 3986 as reserved characters, it will be necessary to either escape them via percent encoding or to encode the whole string (e.g. Base64) and then decode it on the receiving end. Implementing both ought to be trivial, but I prefer the latter option for aesthetic reasons.
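As a rough illustration of the Base64 option, here is a minimal sketch. The helper names (encodeSelector, decodeSelector, buildPinpointUrl) are made up for this example and are not necessarily what Pinpointer itself uses. It assumes the standard btoa, atob, TextEncoder, TextDecoder and URL globals, which any modern browser provides.

```javascript
// Hypothetical helpers for illustration; not Pinpointer's actual code.

// Encode a CSS selector as Base64. btoa() only accepts characters in the
// Latin-1 range, so the selector is converted to UTF-8 bytes first.
function encodeSelector(selector) {
  const bytes = new TextEncoder().encode(selector);
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

// Reverse of encodeSelector: Base64 -> UTF-8 bytes -> selector string.
function decodeSelector(encoded) {
  const binary = atob(encoded);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

// Append the encoded selector to a page URL as its fragment identifier,
// replacing any fragment that was already there.
function buildPinpointUrl(pageUrl, selector) {
  const url = new URL(pageUrl);
  url.hash = encodeSelector(selector);
  return url.toString();
}
```

Something like buildPinpointUrl("https://example.com/article", "#content > ul li") would then give a shareable link whose fragment contains no # or : characters, and decodeSelector(new URL(link).hash.slice(1)) recovers the selector on the receiving end.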

That is the link-sharing use case. Onto the link-visiting one.

Firstly, I realised that some way is needed of signifying to the browser extension when a URL contains a CSS selector for it. Ideally, such URLs would be backwards-compatible—if a user without the extension were to click on one, they would still be sent to the page in question, just without the further Pinpointering. The most naive solution I came up with was to have the extension check every URL fragment identifier it sees for a valid CSS selector. Alternatively, I thought of perhaps wrapping the fragment identifier in double curly braces, but they’re forbidden in URLs (RFC 1738). I thought of using zero-width characters, but they are (quite understandably) blacklisted.

Then I realised I was overthinking things. If an invalid fragment identifier is given in a URL, what happens? Nothing. The page loads as normal. Try it out: https://bengoldsworthy.uk/program/Pinpointer#notarealid. Thus, to the user without any augmentary browser extension, there are only three situations regarding a fragment identifier: an identifier containing a valid id (or name) for an element in the page; an invalid identifier; and no identifier. The last two cases are handled identically. Thus, we can use the available space of the second case for our extension. The aforementioned :target selector lets us identify when a fragment identifier is of the first type, and thus ignore it: if any elements are returned for a search of :target, the identifier is valid.
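The three cases above could be told apart roughly as follows. This is only a sketch under my own assumptions: findPinpointTarget is a hypothetical name, the decode step is passed in as a parameter, and the doc argument is anything with a querySelector method (in a browser, document).

```javascript
// Hypothetical sketch; not Pinpointer's actual code.
// Returns the element a Pinpointer-style fragment points at, or null if the
// fragment is absent, is an ordinary id/name, or cannot be used as a selector.
function findPinpointTarget(doc, hash, decode) {
  const fragment = hash.replace(/^#/, "");

  // Case 3: no identifier at all.
  if (fragment === "") return null;

  // Case 1: the browser already matched an element, so :target is non-empty
  // and the fragment is an ordinary id (or name). Leave it alone.
  if (doc.querySelector(":target")) return null;

  // Case 2: "invalid" as far as the browser is concerned, so it may be ours.
  let selector;
  try {
    selector = decode(fragment);
  } catch {
    return null; // not decodable, so not one of our links
  }
  try {
    return doc.querySelector(selector);
  } catch {
    return null; // decoded text was not a valid CSS selector
  }
}
```

In a content script this might be called as findPinpointTarget(document, location.hash, atob). A fragment such as #notarealid falls through both checks and simply yields null, so the page behaves exactly as it would without the extension.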

The solution (in practice)

This was pretty much my first attempt at writing a browser extension, so please be gentle if I seem a bit slow here.

I figured the first thing I needed to be able to do was to select something on a web page, and pass that to the extension’s pop-up menu. After a bit of faff, I had the pop-up and the web page chatting away quite merrily. Now, I needed to sort out the selector-generating functionality. I started off by nicking Dave Cardwell’s jQuery-GetPath plugin. The plugin returns the full path to a given element, generated recursively, from <html> downwards and including IDs and class names where they arise. This choice led to a lengthy, but ultimately fruitless, attempt to incorporate jQuery into my extension. After realising I was getting nowhere, I set about rewriting Cardwell’s code in plain old JavaScript. I’d always tended to use jQuery without thinking too much about it, so it was interesting to push myself out of my comfort zone.

Once this was done, I could get the path of the selected element and pass it to the pop-up. Whereas Cardwell’s code was interested in generating the full path to an element, I added some timesavers like stopping traversal when a given element had an ID, leading to shorter selectors. Now that I had a selector, I added the functionality for encoding it, appending it to the current URL and presenting that to the user. The link-sharing side of things was largely implemented, and the link-visiting side took little further effort.
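To give a flavour of the path generation with the ID shortcut, here is a minimal sketch. It is neither Cardwell’s code nor mine verbatim: getPath is an illustrative name, the loop is iterative rather than recursive, and it only relies on the tagName, id, className and parentElement properties that DOM elements expose.

```javascript
// Hypothetical sketch of selector generation; not the extension's real code.
// Walks from the element up towards <html>, stopping early at the first
// ancestor (or the element itself) that has an id.
function getPath(el) {
  const parts = [];
  for (let node = el; node; node = node.parentElement) {
    if (node.id) {
      // An id should be unique, so nothing above it is needed.
      // (Assumes the id needs no escaping to be used in a selector.)
      parts.unshift("#" + node.id);
      break;
    }
    let part = node.tagName.toLowerCase();
    const classes =
      typeof node.className === "string"
        ? node.className.split(/\s+/).filter(Boolean)
        : [];
    if (classes.length > 0) {
      part += "." + classes.join(".");
    }
    parts.unshift(part);
  }
  return parts.join(" > ");
}
```

For an li with classes item and active inside a ul with class list inside a div with the id main, this gives #main > ul.list > li.item.active rather than the full path from html downwards.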

I wondered how I was going to implement the means of moving up and down the DOM in order to highlight the specific element the linker wants, and realised that I was going to have to reverse some of the efficiency features I had added to the path generation. It’s no use having the option to move up and down the DOM when you truncate the chain of element selectors after one or two. This done, it was easy enough to separate out the parts of the whole selector and allow the user to traverse them at will (both up and down, although no further down than the element that was initially selected).

I experimented with a range of ways of displaying the selected elements visually, and I think the opacity: 0.5; border: 2px dashed black; rules are quite effective. What I really wanted was to make everything that wasn’t currently selected invisible, but :not(whatever the selector is) {visibility: hidden; } just serves to make the <html> tag, and by extension everything under it, invisible (unless html was the whole selector, but why would anyone do that?). The only solution I can think of is to go through the full selector, element by element, making all of each given element’s siblings invisible. This seems woefully inefficient, however.
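Going back to the up-and-down traversal, the idea of splitting the full selector into parts and stepping through them might look something like this. The createSelectionWalker name is invented for the example, and the sketch assumes the parts are joined with " > " and that no individual part contains that sequence.

```javascript
// Hypothetical sketch; not the extension's real code.
// Wraps a full, untruncated selector and lets the caller step up to
// ancestors and back down, never below the originally selected element.
function createSelectionWalker(fullSelector) {
  const parts = fullSelector.split(" > ");
  let depth = parts.length; // how many parts are currently in use

  const current = () => parts.slice(0, depth).join(" > ");

  return {
    current,
    up() {
      if (depth > 1) depth -= 1; // stop at the outermost element
      return current();
    },
    down() {
      if (depth < parts.length) depth += 1; // stop at the original selection
      return current();
    },
  };
}
```

Starting from html > body > ul > li, calling up() gives html > body > ul, and calling down() returns to the original selection, after which further calls to down() have no effect.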

At this point, the link-creation aspect was almost done. The only remaining issue was with the path generation whenever the selected element had siblings of the same type and with the same class. For example, on the Wikipedia homepage that I was testing on, the “From today’s featured article” section has multiple ID-less <a>s, many with the same mw-redirect class. First, I added functionality to the path generation that would append an :nth-of-type rule to any element that wasn’t uniquely selectable. I’d clearly forgotten how recursion works, and spent a while trying to troubleshoot why I was getting things like a:nth-of-type(28) for what was clearly the fourth or fifth <a> in the section. Obviously, since the method worked from the final element backwards, it was looking at how many <a>s were in the entire web page. That was soon fixed, and after some further pain I’ve ended up with a kludgy, but functional, solution.
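The fix boils down to counting only among the element’s own siblings, since :nth-of-type is relative to the parent. A minimal sketch of that counting follows. The nthOfTypeSuffix name is made up, and as a simplification it adds the suffix whenever there is more than one sibling of the same type, rather than checking whether the element is already uniquely selectable.

```javascript
// Hypothetical sketch; not the extension's real (kludgier) code.
// Returns an :nth-of-type(...) suffix for el, counted among the children of
// its own parent only, or "" if el is the only child of its type.
function nthOfTypeSuffix(el) {
  const parent = el.parentElement;
  if (!parent) return ""; // the root element has no siblings

  const sameType = Array.from(parent.children).filter(
    (child) => child.tagName === el.tagName
  );
  if (sameType.length === 1) return "";

  // :nth-of-type is 1-based.
  return ":nth-of-type(" + (sameType.indexOf(el) + 1) + ")";
}
```

Counting this way, the fourth <a> in a paragraph gets a:nth-of-type(4) no matter how many other links appear elsewhere on the page.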

Replies

    1. We owe it to Web annotators. The note’s a part of a larger Recommendation.

      Unfortunately it doesn’t define a suitable mapping between various types of selectors. The standard holds no provision for translation between, e.g. CSS Selectors and Fragment Selectors. It says nothing about extending URL fragments with CSS or XPath selectors.
