Extensions – Theory

Update: I started writing a single post about extension basics, but it grew too big, so I split it into two: one about the theory and one about the practice. That should make this one more readable. I promise the other one will contain code and a minimum of talk. You can safely skip this post if you are familiar with extension basics and only want to see my solutions for developing one.

Extensions. Extensions everywhere. That sums up my last half year, as I mentioned in the previous post. I think that post was a good intro to this series, so I won’t start this one with stories. Let’s jump into the world of browser extensions.

This post will be about the extension basics, where everything begins:

  • What is an extension exactly?
  • Which browsers are we talking about?
    • Chrome and Firefox
    • IE and Safari
  • Which parts of an extension are cross-browser and reusable, and which are not?
  • How to start developing extensions for multiple platforms?

What is an extension exactly?

“Extensions extend the functionality of the browser and the websites being viewed in it.” This is their official definition.

Extensions have few or no UI components. Showing a lot of information through visual elements isn’t their purpose. Their true power is reading and manipulating tabs and their contents. That is what makes them sexy.

You probably already have some extensions installed, maybe without knowing about them. Let’s check them in your browser.

In Chrome: open chrome://extensions.

In Firefox: open about:addons.

OK, you checked them. Maybe they are not so interesting (if you haven’t been using extensions already). Check these examples:

One really good example is the Evernote extension. It works directly with the current page’s DOM and does cool things with it.

Another good one is Meldium. It uses more components of the API: navigation, opening new tabs, text injection and so on.

Now you know what an extension looks like and have seen what it can do with your browser. And that was only the tip of the iceberg. I hope you have already started to feel that you need to create one. There is so much potential in a thing like that. Now what? Let’s google it. Let’s start searching for some development help.

Chrome Web app vs. Chrome Extension vs. Firefox Add-on vs. Firefox Extension

These are the terms you will probably come across in this context. Which one do we want? What I will create throughout these articles is called an extension in Chrome and an add-on in Firefox. Your eyes work just fine: Chrome extensions are not the same as Firefox extensions. Chrome extensions correspond to add-ons in Firefox, while Firefox extensions are a subset of add-ons with different functionality. Nice, isn’t it?

If you develop for Firefox as well and search for something on the internet, ALWAYS check whether it applies to extensions specifically or to add-ons in general, especially on Mozilla’s developer site. The two can use very different SDKs in different contexts, and you can end up staring at your code wondering why it isn’t working.

More about extension vs. app, add-on vs. extension:

Chrome app vs. extension

Firefox add-on vs. extension

I will develop a Chrome and a Firefox extension simultaneously. I will write ‘extension’ for both, but you already know the difference in terminology between the two browsers.

IE and Safari have different concepts, different installation processes and very different APIs to work with, so I skipped these platforms. Usage of these browsers is so damn low anyway.

Now we know what we want to develop. Let’s see what bricks we can build with.

Architecture of Extensions

What parts does an extension have?

  • Manifest
  • Background page
  • Content script
  • UI pages (Panels and popups)

These terms are used for both Chrome and Firefox. But of course it isn’t that simple: their meanings and implementation details differ between the two. This section gives the definitions with some examples. We will see the similarities and differences in how the two browsers handle these components.

Manifest

First of all we need a manifest file. It is the entry point for creating the extension.

In the case of Firefox it is only a short description file with general info such as name, author and so on. From a development point of view the permissions and the “main” property are the interesting parts, but the background script does the heavy lifting that Chrome handles in the manifest.
We have permission for almost everything by default. The private-browsing, cross-domain-content and multiprocess options are the ones we can edit through this file. The “main” property defines our background script.
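
As a rough illustration, the Firefox description file (package.json in the Add-on SDK) could look something like the object below. It is written as a JavaScript literal that mirrors the JSON file so it can be serialized and checked. The name, author, URL and file path are made up for this sketch, and the exact value shapes of the three permission keys are my assumption; only the “main” property and the permission names come from the text above.

```javascript
// Sketch of a Firefox Add-on SDK description file (package.json),
// written as a JavaScript object so it can be checked and serialized.
// All values are placeholders invented for this example.
const firefoxManifest = {
  name: "my-extension",
  title: "My Extension",
  version: "0.1.0",
  author: "Jane Doe",
  main: "lib/background.js", // the background script
  permissions: {
    "private-browsing": true,
    "cross-domain-content": ["https://example.com"],
    "multiprocess": true
  }
};

// Serialize to the JSON text that would go into the file.
const firefoxManifestJson = JSON.stringify(firefoxManifest, null, 2);
console.log(firefoxManifestJson);
```

Note how little there is to declare: there are no content scripts and no UI here, because on Firefox those are set up from the background script itself.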

The manifest file of a Chrome extension contains a lot more info about the extension itself. Besides the general things, it defines which content scripts to inject into pages and how, and the same for the background scripts. It also defines the starting page, the icon and the browser action that shows or hides our extension’s UI (the popup). Permissions work very differently here.
We have access to nothing by default and need to list every item that we would like to use: “tabs”, “cookies”, “storage”, “activeTab”, “webRequest”, “webRequestBlocking”, “clipboardWrite” and so on.
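
To make the contrast concrete, here is a minimal sketch of what a Chrome manifest.json (manifest version 2) might contain, again written as a JavaScript object. The extension name, file names and match pattern are placeholders, and missingPermissions is a hypothetical helper of mine, not part of any Chrome API. It only illustrates the “list everything you need” rule.

```javascript
// Sketch of a Chrome manifest.json (manifest version 2), written as a
// JavaScript object. Names, file paths and match patterns are placeholders.
const chromeManifest = {
  manifest_version: 2,
  name: "My Extension",
  version: "0.1.0",
  background: { scripts: ["background.js"], persistent: true },
  content_scripts: [
    { matches: ["https://example.com/*"], js: ["content.js"], run_at: "document_idle" }
  ],
  browser_action: { default_icon: "icon.png", default_popup: "popup.html" },
  permissions: ["tabs", "storage", "activeTab"]
};

// Hypothetical helper: list the permissions a feature needs
// but the manifest does not declare.
function missingPermissions(manifest, needed) {
  return needed.filter(function (permission) {
    return manifest.permissions.indexOf(permission) === -1;
  });
}

// "tabs" is declared above, "cookies" is not, so only "cookies" is reported.
console.log(missingPermissions(chromeManifest, ["tabs", "cookies"]));
```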

Background page

The name of this part can be a bit misleading, because a background page doesn’t need to have an actual view (an HTML part). It is the place for the logic behind the extension. It can consist of JavaScript file(s) only, as it does in my case. It has access to the extension SDK, and this is where we can do most of the work we want to do in the extension. It also connects all the other parts of the extension.

There are two kinds of background pages. For long-running tasks we can use persistent background pages; I will use only those. The other type is the so-called event page. Event pages are loaded only when they are needed, which frees memory and other resources. You will see in a later post why I need the persistent ones. Changing from one type to the other takes only one property change in the extension’s manifest file. (https://developer.chrome.com/extensions/event_pages)
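
The “one property” in question is the persistent flag inside the background section of the Chrome manifest. The sketch below shows the switch on a cut-down, manifest-like object. The helper withBackgroundMode and the file name are invented for illustration; in practice you would simply edit the JSON file by hand.

```javascript
// Hypothetical helper: return a copy of a manifest-like object whose
// background page is switched between persistent page and event page.
function withBackgroundMode(manifest, persistent) {
  const background = Object.assign({}, manifest.background, { persistent: persistent });
  return Object.assign({}, manifest, { background: background });
}

const persistentManifest = {
  name: "My Extension", // placeholder
  background: { scripts: ["background.js"], persistent: true }
};

// persistent: false turns the background page into an event page.
const eventPageManifest = withBackgroundMode(persistentManifest, false);

console.log(eventPageManifest.background.persistent);  // false
console.log(persistentManifest.background.persistent); // true (the original is untouched)
```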

arch-1

In the background you can write your business logic as reusable code for both browsers. The differences appear when you want to use the browser itself. Chrome and Firefox expose their features through different APIs, of course, for everything from setting a timer (setTimeout) to sending a message to an open tab in the browser. Background scripts in Chrome own a full-featured window object, so you can simply call window.setTimeout. In Firefox you have a window object with a reduced feature set, and to set a timeout you need to include the SDK’s timers module: require("sdk/timers");

In Chrome you have the chrome object, and the chrome.* APIs are available to you by default. In Firefox you need to use require (Firefox has built-in CommonJS infrastructure to handle modules) to get the different parts of the SDK. More on this in the implementation details part.
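
One way to keep the business logic reusable is to hide these differences behind a tiny “platform” object. The following is only a sketch of that idea under my own assumptions: createReminder, chromePlatform and firefoxPlatform are invented names, and the Firefox variant expects the SDK’s require function to be passed in by the caller.

```javascript
// Business logic shared by both browsers. It depends only on a small
// "platform" object, never on chrome.* or the Firefox SDK directly.
function createReminder(platform) {
  return function remind(text, delayMs, done) {
    return platform.setTimeout(function () {
      done("Reminder: " + text);
    }, delayMs);
  };
}

// Chrome: the background page has a full window object with setTimeout.
function chromePlatform(win) {
  return {
    setTimeout: function (fn, ms) { return win.setTimeout(fn, ms); }
  };
}

// Firefox Add-on SDK: timers come from the "sdk/timers" module.
// sdkRequire is the SDK's own require function, handed in by the caller.
function firefoxPlatform(sdkRequire) {
  const timers = sdkRequire("sdk/timers");
  return {
    setTimeout: function (fn, ms) { return timers.setTimeout(fn, ms); }
  };
}

// Usage sketch that also runs outside a browser: globalThis stands in for window.
const remind = createReminder(chromePlatform(globalThis));
remind("check the tabs", 10, function (message) {
  console.log(message);
});
```

In a Firefox background script the last lines would become createReminder(firefoxPlatform(require)), and createReminder itself would stay unchanged.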

Once you have been developing for a while and start testing your extension in the browser, you can inspect its background page. It will help you debug your extension.

Where can you find your extension’s background page?

In Chrome

base-chrome

In Firefox

base-firefox

UI pages

These are ordinary HTML pages, styled with CSS and powered with some JavaScript. Most of the time these are options pages, which let users customize how the extension works.

We can think of the extension’s UI pages as small web pages. We could show some stuff to the user there, but they are better used for interacting with the background part.

There are two kinds of UI pages. One is the so-called popup. This is what users get when they click on the icon of our extension in the toolbar: we show them our standalone mini site.

arch-2

The other kind of UI window is the panel. Panels are standalone window elements, overlays on the current page. In Chrome the content of the panel becomes part of the current page, loaded into an iframe. In Firefox panels are standalone dialogs, not part of the current page’s DOM.

Actually, in Firefox panels and popups are the same thing, just positioned differently: popups appear under the extension icon in the toolbar, panels anywhere over the current page.

In Chrome, panels and popups behave differently behind the scenes. For example, sending a message looks different from the two kinds of UI elements. In Firefox they are the same and behave the same way. Officially, Firefox uses only the term panel.

UI pages can be inspected in Chrome. As usual, just right-click on the open window of the extension UI and choose Inspect Element. In Firefox I couldn’t find a way to inspect them.

Content scripts

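Content scripts are pieces of JavaScript code that we can inject into the web pages the user visits. They work on the same DOM as the actual page, so they can read and change it. In Chrome they run in an isolated world, though, so they cannot see the JavaScript variables or functions created by the page itself.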
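
A content script can be as small as the sketch below, which reads a few things from the DOM and changes the page title. The function names and the label are invented for this example. The document is passed in as a parameter only so the code can be tried outside a browser; in a real content script you would use the global document directly.

```javascript
// Read from the DOM: collect the page title and count the links.
function summarizePage(doc) {
  const links = doc.querySelectorAll("a[href]");
  return { title: doc.title, linkCount: links.length };
}

// Change the DOM: prefix the page title with a label.
function markPage(doc, label) {
  doc.title = "[" + label + "] " + doc.title;
}

// Inside a real content script the global document is the page's document.
if (typeof document !== "undefined") {
  console.log(summarizePage(document));
  markPage(document, "seen");
}
```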

arch-cs

Content scripts are part of the loaded page, not part of the extension. But they can communicate with our background page through messages, and the communication is two-way, of course. They are our spies behind enemy lines.
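
On the Chrome side this two-way messaging can be sketched with chrome.runtime.sendMessage and chrome.runtime.onMessage. The message shape and the function names below are invented for illustration, and the chrome object is passed in as a parameter so the wiring can be exercised with a stand-in. In a real extension the two halves live in separate files (the background script and the content script).

```javascript
// Browser-independent handler living in the background page.
// The message shape ({ type, title }) is invented for this example.
function handleMessage(message) {
  if (message && message.type === "PAGE_TITLE") {
    return { ok: true, titleLength: message.title.length };
  }
  return { ok: false };
}

// Background side (Chrome): answer messages coming from content scripts.
function listenInBackground(chromeApi) {
  chromeApi.runtime.onMessage.addListener(function (message, sender, sendResponse) {
    sendResponse(handleMessage(message));
  });
}

// Content script side (Chrome): send a message and receive the reply.
function reportTitle(chromeApi, title, onReply) {
  chromeApi.runtime.sendMessage({ type: "PAGE_TITLE", title: title }, onReply);
}

// Only wire things up when the chrome object really exists.
if (typeof chrome !== "undefined" && chrome.runtime && chrome.runtime.onMessage) {
  listenInBackground(chrome);
}
```

The Firefox SDK expresses the same idea with port.emit and port.on instead, so handleMessage is the only part that could be shared as-is.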

In Chrome you can see and also debug your content script.

chromeContent

In Firefox I couldn’t find this option, even with Firebug.

Documentation with more info

For first-hand info you can use the developer sites of both Firefox and Chrome. Both are well documented, with a lot of examples and good descriptions of just about anything. The Firefox documentation was a little confusing for me; Chrome’s was much better. But during development I think you will find yourself reading both developer sites.

 

So these were the main parts of our extension: short, and just the definitions. The next post on this topic will be about my implementation details.

Check it too.

2 thoughts on “Extensions – Theory”

  1. Question:
    Can a content script call functions defined by the page’s scripts, or can the page’s scripts call functions defined by a content script?

    Answer:
    No. Content scripts execute in a special environment called an isolated world. They have access to the DOM of the page they are injected into, but not to any JavaScript variables or functions created by the page. It looks to each content script as if there is no other JavaScript executing on the page it is running on. The same is true in reverse: JavaScript running on the page cannot call any functions or access any variables defined by content scripts.

    Resource:
    https://developer.chrome.com/extensions/content_scripts#execution-environment
