I'm really new to JavaScript/jQuery. I've coded in Objective-C and Swift before, where I could parse an (X)HTML website with XPath and a framework like Hpple.

Now I have to do something similar in JavaScript (Cloud Code from Parse).

My problem is that I'd like to parse a page like this:

var url = "http://www.google.";
var xpath = "//body";
someJavaScriptMagic.parse(url, xpath);

I've often seen people use the document.evaluate method, but they were parsing the page they were currently on, not another website.

Is there a way to do that?

I don't know if it's important, but I'm using Cloud Code from Parse.

EDIT:

I've already tried an AJAX request:

$.ajax({ url: 'http://www.digitec.ch', success: function(data) { alert(data); } });

But I get the following error each time:

XMLHttpRequest cannot load http://www.digitec.ch/. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'http://fiddle.jshell' is therefore not allowed access.
Asked Jan 8, 2015 at 15:14 by Christian; edited Jan 10, 2015 at 16:14.
  • You want to fetch the website's HTML code from the given URL and then extract elements from that HTML code? – Jens Commented Jan 10, 2015 at 15:23
  • Does it have to be XPath? – Jens Commented Jan 10, 2015 at 15:34
  • If you know another way where I can use a "parse string", feel free to show it. :) – Christian Commented Jan 10, 2015 at 15:47
  • You NEED to use YQL to do this from JavaScript. It accepts URLs and XPath expressions, and gives you back XML with CORS, or JSON, that your JS can consume. – dandavis Commented Jan 13, 2015 at 1:54

3 Answers

By default, a browser will not let JavaScript read the response of an AJAX request (i.e., an HTTP request made from JavaScript) to a different origin than the one that served the page, unless the target server explicitly allows it with an Access-Control-Allow-Origin header. In other words, if your JavaScript is served from "foo./some.js" and it attempts to fetch "google.", the request will fail. This is called the same-origin policy, and it is a fundamental principle of web application security. Read about it here: http://en.wikipedia/wiki/Same-origin_policy. Searching for "Access-Control-Allow-Origin" (from your error message) will turn up much more information as well.

You can work around this by making a request to a script on your own domain that serves as a proxy. For example:

foo./some.js

var url = "http://www.google.";
someJavaScriptMagic.get("foo./fetchUrl?url=" + encodeURIComponent(url)); // encode the target so its own query string survives

Then you have a backend script that accepts that request, makes an HTTP request to the host specified by the "url" query parameter, and returns the HTML.
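As a rough sketch of the two ends of such a proxy, the helpers below build the request URL on the client and pull the target back out on the server. The function names `buildProxyUrl` and `extractTargetUrl` and the `example.com` address are made up for illustration; only the `/fetchUrl?url=` shape comes from the example above. The actual fetching and responding is left to whatever backend you use.

```javascript
// Client side: build the URL for a request to your own proxy endpoint.
// The target is percent-encoded so that its "?" and "&" do not get mixed
// up with the proxy's own query string.
function buildProxyUrl(proxyEndpoint, targetUrl) {
  return proxyEndpoint + "?url=" + encodeURIComponent(targetUrl);
}

// Server side: read the "url" query parameter from the incoming request
// path and accept it only if it is an absolute http(s) URL.
// Returns the normalized target URL, or null if it is missing or not allowed.
function extractTargetUrl(requestUrl) {
  // The base is only needed so that a relative path like "/fetchUrl?..." parses.
  const parsed = new URL(requestUrl, "http://localhost");
  const target = parsed.searchParams.get("url");
  if (!target) {
    return null;
  }
  let targetUrl;
  try {
    targetUrl = new URL(target);
  } catch (e) {
    return null; // not an absolute URL
  }
  if (targetUrl.protocol !== "http:" && targetUrl.protocol !== "https:") {
    return null; // refuse file:, ftp:, and so on
  }
  return targetUrl.href;
}

// Example (hypothetical target):
// buildProxyUrl("/fetchUrl", "http://www.example.com/a?b=c")
```

Checking the protocol on the server is a small precaution, since a proxy that fetches whatever it is handed can otherwise be pointed at things you did not intend to expose.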

Take a look at this thread for how to fetch the HTML from a URL.

You can use the jQuery function parseHTML to convert a string into a bunch of DOM objects, and then select elements from those DOM objects.
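A minimal sketch of that approach, assuming jQuery is loaded and the HTML string has already been fetched (for example through a proxy on your own domain). The helper name `selectFromHtml` is hypothetical; `$.parseHTML`, `.append()` and `.find()` are regular jQuery calls.

```javascript
// Select elements from an HTML string using a CSS selector.
// `$` is the jQuery object, passed in explicitly so the function
// does not depend on a global.
function selectFromHtml($, html, selector) {
  // $.parseHTML returns an array of top-level DOM nodes
  // (script elements are dropped by default).
  const nodes = $.parseHTML(html);
  // .find() only searches descendants, so wrap the nodes in a detached
  // <div> to make the top-level elements matchable as well.
  return $("<div>").append(nodes).find(selector);
}

// Browser usage, assuming `data` holds the fetched HTML:
// selectFromHtml(jQuery, data, "a").each(function () {
//   console.log(jQuery(this).attr("href"));
// });
```

Note that parsing a whole page this way typically discards the html, head and body wrappers, so select on the elements inside them rather than on "body" itself.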

If you insist on using XPath, then you might want to take a look at document.evaluate, or this thread.
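document.evaluate is not limited to the page you are on: it can also run against a separate Document built from an HTML string. The sketch below assumes a browser environment and that you already have the HTML as a string; `parseHtml` and `selectByXPath` are illustrative names, not part of any library.

```javascript
// Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE in the DOM spec.
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Parse an HTML string into a detached Document (browser only).
function parseHtml(html) {
  return new DOMParser().parseFromString(html, "text/html");
}

// Evaluate an XPath expression against a Document and
// return the matching nodes as a plain array.
function selectByXPath(doc, xpath) {
  const snapshot = doc.evaluate(xpath, doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  const nodes = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    nodes.push(snapshot.snapshotItem(i));
  }
  return nodes;
}

// Browser usage:
// const doc = parseHtml("<html><body><p>Hello</p></body></html>");
// const bodies = selectByXPath(doc, "//body");
// console.log(bodies[0].textContent);
```

Because the document is parsed separately from the current page, "//body" here refers to the fetched page's body, which is close to the `someJavaScriptMagic.parse(url, xpath)` call in the question once the fetching step is solved.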

I think that SlimerJS will help you.

Tags: Parse html page from url link with xPath (javascript/jQuery), Stack Overflow