Probably a n00b question :-). I'm looking at node-http-proxy to create a filtering proxy. While looking for a string manipulation example, I found a pointer to harmon and ran its example successfully.

Then I tried running my own example against an Apache HTTP server listening on localhost:80. Here is my code:

    var httpProxy = require('http-proxy');

    // Create an array of selects that harmon will process. 
    var actions = [];

    var simpleaction = {};
    simpleaction.query = 'head';
    simpleaction.func = function (node) {
        var out = '<style type="text/css"> h1 {color : red; border-bottom : 5px solid green} </style>';
        node.createWriteStream({ outer: true }).end(out);
        console.log("head function called:" + out);
    };
    var simpleaction2 = {
        query: 'body',
        func: function (node) {
            var out = '<h1>You have been proxied</h1>';
            node.createWriteStream({ outer: false }).end(out);
            console.log("body function called" + out);
        }
    };

    // Add the action to the action array
    actions.push(simpleaction);
    actions.push(simpleaction2);

    var proxy = httpProxy.createServer(
      require('harmon')([], actions),
      80, 'localhost'
    );

    proxy.listen(8899);

    console.log("Up and running on port 8899");

Initially I got an error since I was using a newer version of http-proxy. Using 0.8.7 fixed that. The console output when loading a page now is:

    stw@devmachine:~/tests$ nodejs ptest.js
    Up and running on port 8899
    head function called:<style type="text/css"> h1 {color : red; border-bottom : 5px solid green} </style>
    body function called<h1>You have been proxied</h1>
    head function called:<style type="text/css"> h1 {color : red; border-bottom : 5px solid green} </style>
    body function called<h1>You have been proxied</h1>
    head function called:<style type="text/css"> h1 {color : red; border-bottom : 5px solid green} </style>
    body function called<h1>You have been proxied</h1>
    head function called:<style type="text/css"> h1 {color : red; border-bottom : 5px solid green} </style>
    body function called<h1>You have been proxied</h1>

So the functions are called, but the page that reaches the browser isn't changed at all. What did I miss?

Ultimately I need to:

  • add a stylesheet to the <head> section
  • replace all src and href attributes
  • add some DOM elements at specific places (e.g. a <h1> as first element of the body)
  • add some headers
  • GZIP the result before sending out
  • work on http and https URLs
  • leave image resources alone

Pointers appreciated!

1 Answer

That seems strange. Without the full environment for your test I can't give a definitive answer, but I have put together a POC here that has two actions similar to the ones you are creating.

https://gist.github.com/No9/10874082

That works fine if you clone harmon from git and drop it into the test folder.

    $ git clone https://github.com/No9/harmon.git
    $ cd harmon/test
    $ curl https://gist.githubusercontent.com/No9/10874082/raw/38a26d15b7ecbd875eee0988c94af0333927b98a/host-multiaction.js > host-multiaction.js
    $ node host-multiaction.js

For the overall functionality (GZIP etc.), I would say the new version of node-http-proxy is better positioned to provide a complete solution.

I suggest watching https://github.com/No9/harmon/issues/8 for the next couple of weeks ;)
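In the meantime, here is a rough sketch of two items from the wish list in the question (leaving image resources alone and gzipping the rewritten result), using only Node's built-in `zlib`. The helper names `shouldRewrite` and `gzipBody` are made up for illustration and are not part of http-proxy or harmon. How they get wired into a proxy is deliberately left out.

```javascript
var zlib = require('zlib');

// Hypothetical helper: decide from the Content-Type header whether a response
// should be rewritten. Only HTML qualifies; images and everything else pass through.
function shouldRewrite(contentType) {
  return /^text\/html\b/i.test(contentType || '');
}

// Hypothetical helper: gzip an already rewritten HTML string and return the
// compressed body together with the response headers that must change with it.
function gzipBody(html) {
  var compressed = zlib.gzipSync(Buffer.from(html, 'utf8'));
  return {
    body: compressed,
    headers: {
      'content-encoding': 'gzip',
      'content-length': String(compressed.length)
    }
  };
}

var result = gzipBody('<h1>You have been proxied</h1>');
console.log(shouldRewrite('text/html; charset=utf-8'), shouldRewrite('image/png'));
console.log(result.headers);
```

Note that the length header has to be recomputed after both the rewrite and the compression. Also, if the upstream server already sends compressed HTML, the rewrite step would presumably need the uncompressed body first.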


3 Comments

I saw that in your content you were querying classes like ".b", while I queried elements like "a" or "body". Also, I'm going after a real website. I copied my file into the test folder and found it also doesn't honour the request headers and shows the wrong site. Have a look: gist.github.com/Stwissel/10896955 (you should be able to run that from your test folder; compare to the direct URL notessensei.com).
When I run your sample against www.venshare.com it works fine, so it's only the request headers that need honouring now. I created the issue github.com/No9/harmon/issues/9 to track it, as I am going to deal with the upgrade to http-proxy 1.x first.
Interesting, works here too. May I make a wild guess: Venshare returns 79 lines of HTML. The other website is about 300 lines (and the real notessensei is about 1000 lines), so could it be the chunking? Also, for my understanding: my sample seems to replace the respective node, but I just want to add a child.
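To make the chunking guess from the last comment concrete, here is a minimal sketch in plain JavaScript. It says nothing about how harmon works internally; it only illustrates, under the assumption of naive per-chunk string matching, why a small page that arrives in one chunk could be rewritten while a larger page split across chunks is not. The function names are hypothetical. Both variants insert the banner after the opening `<body>` tag, so the existing children are kept rather than replaced.

```javascript
// Naive per-chunk rewrite: the banner is inserted only if the complete
// "<body>" tag happens to arrive inside a single chunk.
function rewritePerChunk(chunks, banner) {
  return chunks.map(function (chunk) {
    return chunk.replace('<body>', '<body>' + banner);
  }).join('');
}

// Buffered rewrite: all chunks are joined first, so a tag that was split
// across a chunk boundary is still found.
function rewriteBuffered(chunks, banner) {
  return chunks.join('').replace('<body>', '<body>' + banner);
}

var banner = '<h1>You have been proxied</h1>';
var oneChunk = ['<html><body><p>hi</p></body></html>'];
var splitChunks = ['<html><bo', 'dy><p>hi</p></body></html>'];

console.log(rewritePerChunk(oneChunk, banner));    // banner inserted
console.log(rewritePerChunk(splitChunks, banner)); // page comes back unchanged
console.log(rewriteBuffered(splitChunks, banner)); // banner inserted
```

Buffering the whole response is the simplest fix for this particular failure, at the cost of holding the page in memory; a streaming parser avoids that trade-off.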
