Extensions – Practice

The previous post covered the main parts of an extension, briefly and as definitions only. Let’s see them in practice.

TypeScript

I used TypeScript for the development. On the original project we chose it over plain JavaScript because a lot of developers worked on the code. There were plenty of misspelling errors that surfaced only at run time, and we had trouble keeping everything under control. We needed rules and a more structured, controlled environment for developing in JavaScript, so we chose TypeScript. I got used to developing with it, which is why this project is in TypeScript as well.
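To illustrate the kind of misspelling that TypeScript moves from run time to compile time, here is a minimal sketch. The event names and the describeEvent function are made up for this example and are not taken from the project.

```typescript
// Hypothetical event names, used only for this illustration.
type ExtensionEventName = "pageLoaded" | "userLoggedIn";

// Accepts only the names declared above.
function describeEvent(eventName: ExtensionEventName): string {
  return "handling " + eventName;
}

const description = describeEvent("pageLoaded");
console.log(description);

// The misspelled call below is rejected by the compiler, because the string
// is not one of the declared names. In plain JavaScript the typo would only
// show up when this line actually runs.
// describeEvent("pageLoadded");
```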

Developing this way requires only a few extra things.

  • You need typings: type definition files for regular JavaScript libraries. Don’t panic, though, because there is a collector portal where you can find typings for a lot of different libraries. Really a lot: I found typings for libraries so small that I couldn’t believe it at first. (:
  • You need to compile the TypeScript files, but gulp supports this. You can automate it with a single gulp plugin called gulp-typescript.

Structure

Here is the structure for my source.

path

The parts are well separated: Background, Content, UI and Manifest mirror the architectural parts of an extension, plus Common and Assets. The files in this structure are explained in the following sections.

The other infrastructural elements of my solution:

path2

  • Gulpfile for automation
  • package.json for libraries installed with npm
  • tsd.json for type definitions installed with tsd
  • .gitignore for git
  • and dist for gulp-generated, package-ready files.

Messaging

We have these parts: the background page, the extension’s UI and the content script.
What they have in common is that they all communicate with each other.
Messaging connects all of them.

So for a good extension skeleton we need a good messaging system.

In my solution I built a base class for all my components that contains the messaging mechanism. My implementation is based on broadcasting.

A message sent from a content script reaches every background script built on my base class, but only the subscribed ones handle it. One event can matter to several background components. The same goes for messages from the extension’s UI.

Background scripts broadcast their messages to both the content scripts and the UI scripts. UI scripts can’t communicate directly with content scripts, but they can go through the background script: the UI sends the message to the background script, which forwards it to the content script.

Every base class has the same functionality.

msg1

In the constructor you pass your component’s event handler to the browser’s event handler. The browser part is explained in the next section; these are the common, browser-independent parts.

The base class gives each of your component classes two arrays: listeners and pending responses. Storing them and checking them on every action ensures that no message or reply gets lost in the extension’s messaging traffic.

As an example, I created a test file. It defines its event subscriptions in the constructor. Other classes extending the base class have their own arrays, so every component knows which events it is interested in and which it is not.

msg2

The actual ‘on’ subscribing method only adds the subscription to the array. Going through the array and checking which events to handle happens when the browser’s listener gets a message.

msg3

Sending a message means preparing a well-defined event package and triggering the event. Triggering differs between browsers, which is why it is only a method call here, a method of the browser implementation.
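The subscribe, receive and send flow described above can be sketched roughly as follows. This is not the code from the skeleton repository: the names EventPackage, BrowserAdapter, EventSubscription, MessagingBase and trigger are assumptions made for the illustration, and the pending-responses array is left out to keep it short. Only eventReceived is a name the post itself uses for the browser file.

```typescript
// Shape of one message travelling between components (field names are assumptions).
interface EventPackage {
  eventName: string;
  data: unknown;
}

// What a component expects from its browser-specific file (hypothetical interface).
interface BrowserAdapter {
  eventReceived(handler: (pkg: EventPackage) => void): void;
  trigger(pkg: EventPackage): void;
}

interface EventSubscription {
  eventName: string;
  callback: (data: unknown) => void;
}

class MessagingBase {
  private listeners: EventSubscription[] = [];
  private browser: BrowserAdapter;

  constructor(browser: BrowserAdapter) {
    this.browser = browser;
    // Hand our handler to the browser layer; it calls us for every incoming message.
    this.browser.eventReceived((pkg) => this.handle(pkg));
  }

  // Subscribing only stores the subscription.
  on(eventName: string, callback: (data: unknown) => void): void {
    this.listeners.push({ eventName, callback });
  }

  // Sending prepares the package and lets the browser layer deliver it.
  send(eventName: string, data: unknown): void {
    this.browser.trigger({ eventName, data });
  }

  // Every message arrives here, but only matching subscriptions run.
  private handle(pkg: EventPackage): void {
    for (const subscription of this.listeners) {
      if (subscription.eventName === pkg.eventName) {
        subscription.callback(pkg.data);
      }
    }
  }
}
```

A component would extend MessagingBase, call on(...) in its constructor for the events it cares about, and call send(...) whenever it has something to broadcast.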

Browser & Component differences of messaging

As I mentioned, the base classes are duplicates of each other. In the code, yes, but every component uses a different browser implementation, so we have a browser file for background scripts, content scripts and UI scripts alike. First I will show the differences between the components, that is, the browser implementations of the different components. After that I will show the actual browser differences between Chrome and Firefox.

Not clear yet? No problem. I will show everything in practice, and I hope that makes things clear.

Background script:

browser1

chrome.runtime.onMessage.addListener is the Chrome API method for subscribing to messages. When a message arrives, we iterate through the callbacks of all our subscribed classes (remember, the base class passed its handler to the browser implementation in its constructor) and call them with the actual event data.

The eventListeners array is filled through the eventReceived method.

browser2
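As a rough sketch of the background browser file described above: one browser-level listener fans each message out to every registered handler. RuntimeLike is a hand-written minimal slice of chrome.runtime so the sketch compiles without typings, and the class name BackgroundBrowser is an assumption. In a real extension you would pass chrome.runtime to the constructor.

```typescript
// Minimal slice of chrome.runtime used here (written by hand for this sketch).
interface RuntimeLike {
  onMessage: {
    addListener(callback: (message: unknown) => void): void;
  };
}

type MessageHandler = (message: unknown) => void;

class BackgroundBrowser {
  private eventListeners: MessageHandler[] = [];

  constructor(runtime: RuntimeLike) {
    // Single subscription to the browser; every stored handler gets the message.
    runtime.onMessage.addListener((message) => {
      for (const listener of this.eventListeners) {
        listener(message);
      }
    });
  }

  // Called by each base class in its constructor to register its handler.
  eventReceived(handler: MessageHandler): void {
    this.eventListeners.push(handler);
  }
}
```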

Sending a message from background scripts has two steps: sending the message to the UI scripts and sending it to the content scripts. The two work differently.

browser3

The UI can communicate directly with the background script through the runtime, so we can use that to deliver messages between them. But as I mentioned in my previous post, content scripts are technically not part of the extension. They get their messages through the page they are injected into. In the background script we can get all the pages open in the browser with chrome.tabs.query and send the message to all of them. Only the pages containing our injected logic and its event handlers will do something with the message; the others will ignore it.
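The two-step broadcast can be sketched like this. BroadcastApi is a hand-written slice of the chrome object covering only runtime.sendMessage, tabs.query and tabs.sendMessage, and the function name broadcastFromBackground is an assumption. In a real background script you would pass the chrome object itself.

```typescript
interface TabLike {
  id?: number;
}

// Only the calls this sketch needs, modelled on the Chrome extension API.
interface BroadcastApi {
  runtime: { sendMessage(message: unknown): void };
  tabs: {
    query(queryInfo: object, callback: (tabs: TabLike[]) => void): void;
    sendMessage(tabId: number, message: unknown): void;
  };
}

function broadcastFromBackground(api: BroadcastApi, message: unknown): void {
  // Step 1: extension pages such as the popup listen on the runtime.
  api.runtime.sendMessage(message);

  // Step 2: content scripts live in tabs; an empty query matches every tab.
  api.tabs.query({}, (tabs) => {
    for (const tab of tabs) {
      if (tab.id !== undefined) {
        // Tabs without our content script simply have no listener for this.
        api.tabs.sendMessage(tab.id, message);
      }
    }
  });
}
```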

Content script & UI scripts (Popup):

The background needs different methods to send messages to content scripts and to UI scripts, but both of those receive messages through the runtime and can use it to send messages in the same way.

cs1

cs2

The type in sendMessage is for the browser: extension communication goes through the type ‘message’. In the payload we can define our own type as well, identifying our messages by our own types or names.

The UI script implementation is for popups, not for panels. Chrome differentiates between the two. You can find more about this in my skeleton implementation.
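One way to picture the two levels of typing is an outer envelope whose type is always "message" and an inner payload that carries our own name. This is a sketch under assumptions: the field names name and data, and the helpers wrapEnvelope and unwrapEnvelope, are invented for the illustration and may differ from the skeleton.

```typescript
// Our own payload, identified by a name we choose (field names are assumptions).
interface ExtensionPayload {
  name: string;
  data: unknown;
}

// The outer object handed to sendMessage; its type is fixed.
interface TransportEnvelope {
  type: "message";
  payload: ExtensionPayload;
}

function wrapEnvelope(name: string, data: unknown): TransportEnvelope {
  return { type: "message", payload: { name, data } };
}

// Returns our payload, or null for anything that is not one of our envelopes.
function unwrapEnvelope(raw: unknown): ExtensionPayload | null {
  if (typeof raw !== "object" || raw === null) {
    return null;
  }
  const candidate = raw as Partial<TransportEnvelope>;
  if (candidate.type !== "message" || candidate.payload === undefined) {
    return null;
  }
  return candidate.payload;
}
```

Checking the envelope on the receiving side lets a listener ignore messages that did not come from our own components.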

 

Messaging in Firefox looks different. In background scripts you have to create a panel as the main component of your extension. It has a port that we can use for sending and receiving messages.

In content scripts and UI scripts we have a ‘self‘ or ‘addon‘ object with the port, or we can use the window object for sending and listening to events. You can find more about this in my skeleton implementation.

Other browser differences

We need to hide more differences than just messaging. Even little things like timeouts differ between the browsers.

In Chrome:

br2

In Chrome background scripts you can use a fully featured window object, but in Firefox that object has no timeout functions. The SDK has a timers module for that.

In Firefox:

br3

In Chrome you can use the chrome.* objects and methods.

In Firefox you need to use the underlying module loader, the built-in CommonJS require, to get the SDK’s libraries.

br1
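A small sketch of how the timer difference can be hidden behind one interface. TimerApi, chromeTimers and createFirefoxTimers are names invented for this example. The Firefox variant takes the require function as a parameter so the sketch can also run outside the browser; in the Add-on SDK you would pass the global require, and "sdk/timers" is the module the post refers to.

```typescript
type TimerHandle = ReturnType<typeof setTimeout>;

// The common surface both browser files expose (hypothetical interface).
interface TimerApi {
  setTimeout(callback: () => void, delayMs: number): TimerHandle;
  clearTimeout(handle: TimerHandle): void;
}

// Chrome background page: the global timer functions are simply there.
const chromeTimers: TimerApi = {
  setTimeout: (callback, delayMs) => setTimeout(callback, delayMs),
  clearTimeout: (handle) => clearTimeout(handle),
};

// Firefox Add-on SDK: the same functions come from the timers module.
function createFirefoxTimers(sdkRequire: (moduleId: string) => unknown): TimerApi {
  return sdkRequire("sdk/timers") as TimerApi;
}
```

The shared business logic then only ever sees a TimerApi and does not care which browser provided it.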

More features that need different implementations per browser will come. In the skeleton I needed only these. I think later posts will say more about these differences. Don’t miss them.

Extension bootstrapping

Bootstrapping an extension is very different in the two browsers.

Chrome uses manifest.json for everything.

manifest

It binds the Ctrl + B key combination to opening our UI. The UI is the HTML file given in default_popup. We define the background scripts and the content scripts here as well, and Chrome knows what to do with them.

Firefox has only a few details in the package.json file.
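For readers who cannot see the screenshot, here is a sketch of such a manifest, written as a TypeScript object and serialized to the JSON that would go into manifest.json. It assumes the Manifest V2 format that was current when this post was written. The extension name, version and all file paths are placeholders, not the ones from the repository.

```typescript
// Sketch of a Manifest V2 manifest; name, version and paths are placeholders.
const manifestSketch = {
  manifest_version: 2,
  name: "Extension skeleton",
  version: "0.1.0",
  // The popup HTML is the extension's UI.
  browser_action: { default_popup: "ui/popup.html" },
  // Keyboard shortcut that opens the browser action popup.
  commands: {
    _execute_browser_action: { suggested_key: { default: "Ctrl+B" } },
  },
  // Several background scripts may be listed; persistent keeps them loaded.
  background: { scripts: ["background/background.js"], persistent: true },
  content_scripts: [
    {
      matches: ["<all_urls>"],
      js: ["content/content.js"],
      run_at: "document_end",
      all_frames: true,
    },
  ],
};

const manifestJson = JSON.stringify(manifestSketch, null, 2);
console.log(manifestJson);
```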

manifest3

It defines only the path of the background script; Firefox uses the background script and the browser’s SDK to create the parts of the extension.

manifest2

Here you define the panel and its properties, and the icon (the button) in the browser’s toolbar, which shows and hides the extension’s UI. Injecting the content scripts also happens here, with the SDK’s pageMod library.

For content scripts, the * says that we want our script injected into every opened tab; we also define the path of the script and ask for it to be loaded and run on the document ready event.

In Chrome (previous picture), matches: <all_urls> is the same as the * in Firefox. js defines the path. run_at corresponds to contentScriptWhen. In Chrome we have an extra option to include our script not just in the page but in all of the iframes the page contains. Firefox uses another option for that.
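The correspondence between the two configurations can be written down as a small translation function. This is a sketch, not part of the skeleton: toPageModOptions and both interfaces are invented here, the run_at mapping is only an approximate equivalence, and it assumes attachTo is the pageMod option that plays the role of all_frames. Real SDK code would also resolve script paths through the SDK's self module, which is left out.

```typescript
// One entry of Chrome's content_scripts array (only the fields discussed above).
interface ChromeContentScript {
  matches: string[];
  js: string[];
  run_at?: "document_start" | "document_end" | "document_idle";
  all_frames?: boolean;
}

// The matching options for the Add-on SDK's pageMod (subset used in this sketch).
interface PageModOptions {
  include: string[];
  contentScriptFile: string[];
  contentScriptWhen: "start" | "ready" | "end";
  attachTo: Array<"top" | "frame">;
}

function toPageModOptions(entry: ChromeContentScript): PageModOptions {
  const include = entry.matches.map((pattern) => {
    // Other Chrome match patterns use a different syntax and need manual translation.
    if (pattern !== "<all_urls>") {
      throw new Error("Only <all_urls> is translated in this sketch: " + pattern);
    }
    return "*";
  });
  // Approximate equivalents; the two browsers do not fire at identical moments.
  const when = { document_start: "start", document_end: "ready", document_idle: "end" } as const;
  return {
    include,
    contentScriptFile: entry.js,
    contentScriptWhen: when[entry.run_at ?? "document_idle"],
    attachTo: entry.all_frames ? ["top", "frame"] : ["top"],
  };
}
```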

aaaaa

All the sources mentioned in this article can be found in my repository on GitHub: https://github.com/FerencKun/extensionbase

Once we are done with development, we need a good automated infrastructure to build the files and prepare them for packaging and publishing.

Automations with gulp

Developing an extension for both browsers, Chrome and Firefox, could be very hard without gulp. Gulp tasks help us build the right package for both browsers with a little configuration and parameterized tasks. I don’t have space here to explain what gulp is, so I will write as if you knew the basics of this technology; if you don’t, here are some good links to learn more about gulp.

One big difference is that in Chrome we can list several background scripts, but in Firefox we can define only one file. So we need to concatenate all the background scripts into one.

gulp

Using TypeScript makes our work easier, but for packaging we of course need the compiled JavaScript. Gulp helps.

gulp1

We can include the browser-specific files in our copy task, but then the HTML files need different path settings (unless we want to rename the destination files). Gulp has an option for this as well.

We can define string patterns (with custom prefixes) to be replaced with a value. In the HTML I write //@@browserFilePath and gulp replaces it with chrome.js or firefox.js according to the parameter.

Now the files are ready to be packaged for the browsers. I will write about packaging and publishing later. I hope this is enough content for one post, but more on this topic is coming soon.
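To show what the substitution does, here is a plain function that performs the same kind of replacement on a string. It is not the gulp plugin, only a sketch of the idea: replaceBuildTokens and TargetBrowser are invented names, and the //@@ prefix and the browserFilePath token are the ones mentioned above.

```typescript
type TargetBrowser = "chrome" | "firefox";

// Replaces every occurrence of prefix + key with the given value.
function replaceBuildTokens(
  source: string,
  values: Record<string, string>,
  prefix: string = "//@@"
): string {
  // Longest keys first, so a token is never cut short by a shorter key.
  const keys = Object.keys(values).sort((a, b) => b.length - a.length);
  let result = source;
  for (const key of keys) {
    const value = values[key];
    if (value === undefined) {
      continue;
    }
    result = result.split(prefix + key).join(value);
  }
  return result;
}

// The build parameter decides which browser file the HTML will load.
function browserFileFor(target: TargetBrowser): string {
  return target + ".js";
}

const popupTemplate = '<script src="//@@browserFilePath"></script>';
const popupForChrome = replaceBuildTokens(popupTemplate, {
  browserFilePath: browserFileFor("chrome"),
});
console.log(popupForChrome);
```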

To be continued…

Extensions – Theory

Update: I started to write one post about extension basics but it grew too big, so I decided to split it in two: one about the theory and one about the practice. That should make this one more readable. I promise the other one will contain code and a minimum of talk. You can easily skip this one if you are familiar with extension basics and are only interested in my solutions for developing one.

extensions.PNG

Extensions. Extensions everywhere. That is what my last half year was about, as I mentioned in the previous post. I think it was a good intro to this series, so I won’t start this one by telling stories. Let’s jump into the world of browser extensions.

This post will be about the extension basics, where everything begins:

  • What is an extension exactly?
  • Which browsers are we talking about?
    • Chrome and Firefox
    • IE and Safari
  • Which parts of an extension are cross-browser and reusable, and which are not?
  • How to start developing extensions for multiple platforms?

What is an extension exactly?

“Extensions extend the functionality of the browser and the websites being viewed in it.” – this is the official definition of them.

Extensions have few or no UI components. Showing a lot of information through visual elements isn’t their purpose. Their true power is in accessing and manipulating tabs and their contents. That is what makes them sexy.

I think you already have some extensions installed, maybe without knowing about them. Let’s check them in your browser.

In Chrome:

In Firefox:

OK, you checked them. Maybe they are not so interesting (if you haven’t used extensions before). Check these examples:

One really good example is the Evernote extension. It really uses the actual page’s DOM and does cool things with it.

Another good one is Meldium. It uses more components of the API: navigation, opening new tabs, text injection and so on.

Now you know what an extension looks like and have seen what they can do with your browser. And that was only the tip of the iceberg. But I hope you have already started to feel that you need to create one. There is so much potential in a thing like that. Now what? Let’s google it. Let’s start searching for some development help.

Chrome Web app vs. Chrome Extension vs. Firefox Add-on vs. Firefox Extension

These are the terms you will probably find in this context. Which one is it? What I will create in these articles is called an extension in Chrome and an add-on in Firefox. Your eyes work just fine: Chrome extensions are not the same as Firefox extensions. Chrome extensions are the same as add-ons in Firefox. Firefox extensions are a smaller subset of add-ons with different functionality. Nice, isn’t it?

Note: if you develop for Firefox as well and search for something on the internet, ALWAYS check whether it is about extensions or about add-ons in general, especially on Mozilla’s developer site. The two can use very different SDKs in different contexts, and you will be left wondering why your code isn’t working.

More about extension vs. app, add-on vs. extension:

Chrome app vs. extension

Firefox add-on vs. extension

I will develop a Chrome and a Firefox extension simultaneously. I will write ‘extension’, but you already know how the terms differ between the two browsers.

IE and Safari have different concepts and installation processes, and very different APIs to work with, so I skipped these platforms. Usage of these browsers is so damn low.

Now we know what we want to develop. Let’s see what bricks we can build with.

Architecture of Extensions

What parts does an extension have?

  • Manifest
  • Background page
  • Content script
  • UI pages (Panels and popups)

These are the terms in both cases, Chrome and Firefox. But of course it isn’t that simple: their meanings and implementation details differ between the two. This section gives the definitions with some examples. We will see the similarities and differences in how the two browsers handle the following components.

Manifest

First of all we need a manifest file. It is the entry point of the extension.

In Firefox it is only a short description file with general info such as name, author… From a development point of view the permissions and the main property are interesting, but the background script does the heavy lifting that the manifest does in Chrome.
We have permission for almost everything by default. The private-browsing, cross-domain-content and multiprocess options are editable through this file. The “main” property defines our background script.

The manifest file of a Chrome extension contains a lot more info about the extension itself. Besides the general things, it defines the content scripts to inject into pages (what and how), and the same for the background scripts. It defines the starting page, the icon and the browser action that shows or hides our extension’s UI (popup). Permissions are very different here.
We have access to nothing by default, and we need to list every item we would like to use: “tabs”, “cookies”, “storage”, “activeTab”, “webRequest”, “webRequestBlocking”, “clipboardWrite” and so on.

Background page

The name of this part can be a bit misleading, because a background page doesn’t need to have an actual view, an HTML part. It is the place for the logic behind the extension. It can be just JavaScript file(s), as it is in my case. It has access to the extension SDK, and this is where we can do most of the work we want to do in the extension. It also connects all the other parts of the extension.

There are two kinds of background pages. For long-running tasks we can use persistent background pages; I will use only those. The other kind is the so-called event page. Event pages are loaded only when they are needed, which frees memory and other resources. You will see in a later post why I need the persistent ones. Changing from one type to the other takes only one property change in the extension’s manifest file. (https://developer.chrome.com/extensions/event_pages)

arch-1

In the background you can write reusable code for your business logic for both browsers. The differences come when you want to use the browser itself. Of course, the API for Chrome’s features is different from the one for Firefox’s, from setting a timer (setTimeout) to sending a message to an open tab. Background scripts in Chrome own a full-featured window object, so you can simply call window.setTimeout. In Firefox the window object has a reduced feature set; to set a timeout you need to include the SDK’s timers library: require("sdk/timers");

In Chrome you have the chrome object; the chrome.* functionality is available by default. In Firefox you need to use require (Firefox has a built-in CommonJS infrastructure for handling modules) to get the different SDK parts. More in the implementation details part.

After developing for a while, when you start to test your extension in the browser, you can inspect your extension’s background page. It helps with debugging.

Where can you find your extension’s background page?

In Chrome

base-chrome

In Firefox

base-firefox

UI pages

These are ordinary HTML pages, styled with CSS and powered with some JavaScript. Most of the time these are options pages, which let users customize how the extension works.

We can see the UI pages of the extension as small web pages, and we could show some stuff to the user, but they are better used for interacting with the background part.

There are two kinds of UI pages. One is the so-called popup. This is what users get when they click the icon of our extension in the toolbar: we show them our standalone mini site.

arch-2

The other kind of UI window is the panel. Panels are standalone window elements, overlays on the actual page. In Chrome the content of the panel is part of the actual page, loaded into an iframe. In Firefox they are standalone dialogs, not part of the actual page’s DOM.

Actually, in Firefox panels and popups are the same thing, just positioned differently: popups under the extension icon in the toolbar, panels anywhere on the actual page.

In Chrome, panels and popups behave differently behind the scenes; for example, sending a message looks different from the two kinds of UI element. In Firefox they are the same, with the same behavior. Officially, Firefox uses only the term panel.

UI pages can be inspected in Chrome. As usual, just right-click the opened window of the extension UI and you get the Inspect Element function. In Firefox I couldn’t find a way to inspect them.

Content scripts

Content scripts are pieces of JavaScript code that we can inject into the web pages the user visits. This code can read and change the page’s DOM. (In Chrome it runs in an isolated world: it shares the DOM with the page, but not the page’s JavaScript variables.)

arch-cs

Content scripts are part of the loaded page, not of the extension. But they can communicate with our background page through messages, and of course the communication is two-way. They are our spies behind enemy lines.

In Chrome you can see and also debug your content script.

chromeContent

In Firefox I couldn’t find this option, even with Firebug.

Documentation with more info

For first-hand info you can use the developer sites of both Firefox and Chrome. Both are well documented, with a lot of examples and good descriptions. The Firefox documentation was a little confusing for me; Chrome’s was much better. But I think you will find yourself reading both during development.

 

So these were the main parts of our extension, briefly and as definitions only. The next post on this topic will be about my implementation details.

Check it out too.