I've started playing around with the gremlin-python wrapper to interact with my Gremlin Server.

I did the following steps:

./bin/gremlin.sh

Once the Gremlin console opens up, I loaded configurations using:

graph = JanusGraphFactory.open('conf/gremlin-server/janusgraph-cassandra-es.properties')
g = graph.traversal()
saturn = g.V().has('name', 'saturn')

The commands above work fine in the Gremlin shell, and I can see the vertices listed. But when I try to do the same in Python, I get an empty graph. Here is my Python code:

graph = Graph()
g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))
print(g)

It returns: graphtraversalsource[graph[empty]]

Why am I getting an empty graph? My guess is that it is unable to connect to the same graph source. Is there something I'm missing?

Note that in:

JanusGraphFactory.open('conf/gremlin-server/janusgraph-cassandra-es.properties')

the config filename provided is the one used to start Gremlin Server.

Any help is really appreciated.

Thanks

2 Answers

The reason you are seeing graph[empty] is that it is the actual string representation of the Python graph object (see the code here). The graph may actually contain data, though, so it would be better if it were something like graph[remote] or graph[] instead. I've opened an issue to address this.
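To illustrate why the label says nothing about the remote data, here is a minimal sketch in plain Python. StubGraph and StubTraversalSource are hypothetical stand-ins written for this example; they are not the gremlin_python source, and the vertex count of 12 is simply passed in to play the role of server-side data.

```python
# Hypothetical stand-ins, NOT the gremlin_python classes: they only mimic
# the behaviour described above (a fixed label, data held elsewhere).
class StubGraph:
    def __repr__(self):
        # The label is a constant; it is never derived from stored data.
        return "graph[empty]"


class StubTraversalSource:
    def __init__(self, graph, remote_vertex_count):
        self.graph = graph
        # Pretend this number lives on the server, not in the local object.
        self._remote_vertex_count = remote_vertex_count

    def __repr__(self):
        return "graphtraversalsource[" + repr(self.graph) + "]"

    def count_vertices(self):
        # Stands in for a round trip to the server.
        return self._remote_vertex_count


g = StubTraversalSource(StubGraph(), remote_vertex_count=12)
print(g)
print(g.count_vertices())
```

The printed label stays the same whatever count the stub is given, which is the same situation as a remote traversal source whose data sits on Gremlin Server.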

Out of the box, JanusGraph isn't configured for Python. You can find docs on how to do this in the Apache TinkerPop docs. First, install the gremlin-python plugin on the server. Here's the command, assuming you're using JanusGraph 0.1.1, which uses TinkerPop 3.2.3:

bin/gremlin-server.sh -i org.apache.tinkerpop gremlin-python 3.2.3

Next, modify conf/gremlin-server/gremlin-server.yaml to add the gremlin-python script engine:

scriptEngines: {
  gremlin-groovy: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI],
    scripts: [scripts/empty-sample.groovy]},
  gremlin-jython: {},
  gremlin-python: {}
}

To use Gremlin Python, you need to go through a Gremlin Server, so start the JanusGraph pre-packaged distribution:

bin/janusgraph.sh start

From the Gremlin Console:

gremlin> graph = JanusGraphFactory.open('conf/janusgraph-cassandra-es.properties')
==>standardjanusgraph[cassandrathrift:[127.0.0.1]]
gremlin> GraphOfTheGodsFactory.load(graph)
==>null
gremlin> g = graph.traversal()
==>graphtraversalsource[standardjanusgraph[cassandrathrift:[127.0.0.1]], standard]
gremlin> g.V().count()
14:51:58 WARN  org.janusgraph.graphdb.transaction.StandardJanusGraphTx  - Query requires iterating over all vertices [()]. For better performance, use indexes
==>12

Install the Gremlin-Python driver, again matching the TinkerPop version:

pip install gremlinpython==3.2.3
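If you want to confirm which driver version actually got installed, something like the following sketch may help. It assumes Python 3.8+ (for importlib.metadata), and it treats "same major.minor" as a rule of thumb for compatibility, which is my assumption rather than a documented guarantee. The helper names release_line, same_release_line and installed_gremlinpython_version are made up for this example.

```python
def release_line(version):
    """Return (major, minor) from a version string such as '3.2.3'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def same_release_line(client_version, server_version):
    """Rule-of-thumb check (an assumption): major and minor must match."""
    return release_line(client_version) == release_line(server_version)


def installed_gremlinpython_version():
    """Return the installed gremlinpython version string, or None."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Older Python without importlib.metadata.
        return None
    try:
        return version("gremlinpython")
    except PackageNotFoundError:
        return None


# TinkerPop version of the server, taken from the answer above.
SERVER_TINKERPOP_VERSION = "3.2.3"

client = installed_gremlinpython_version()
if client is None:
    print("gremlinpython does not appear to be installed")
elif same_release_line(client, SERVER_TINKERPOP_VERSION):
    print("driver", client, "is on the same release line as the server")
else:
    print("driver", client, "does not match server", SERVER_TINKERPOP_VERSION)
```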

From the Python 3 shell:

>>> from gremlin_python import statics
>>> from gremlin_python.structure.graph import Graph
>>> from gremlin_python.process.graph_traversal import __
>>> from gremlin_python.process.strategies import *
>>> from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
>>> graph = Graph()
>>> g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))
>>> print(graph)
graph[empty]
>>> print(g)
graphtraversalsource[graph[empty]]
>>> g.V().count().next()
12
>>> g.addV('god').property('name', 'mars').property('age', 3500).next()
v[4280]
>>> g.V().count().next()
13

Keep in mind that when you are working in the Python shell, graph traversals are not automatically iterated, so you need to iterate the traversal with iterate(), next(), or toList().
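As a rough sketch of that point, the function below builds the traversal first and only submits it when next() is called. The name fetch_vertex_count is hypothetical, the URL and traversal source name are the ones used in the session above, and the imports sit inside the function so that defining it needs neither the library nor a running server.

```python
def fetch_vertex_count(url="ws://localhost:8182/gremlin", traversal_source="g"):
    """Count vertices on a remote graph; needs gremlinpython and a live server."""
    # Imported here so that merely defining the function has no dependencies.
    from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
    from gremlin_python.structure.graph import Graph

    connection = DriverRemoteConnection(url, traversal_source)
    try:
        g = Graph().traversal().withRemote(connection)
        pending = g.V().count()  # only builds the traversal; nothing is sent yet
        return pending.next()    # next() submits it and returns the first result
    finally:
        # Release the websocket connection whether or not the query succeeded.
        connection.close()
```

With the server from the steps above running, calling fetch_vertex_count() should return the same number as g.V().count().next() in the shell session.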

11 Comments

Thanks for the detailed steps, but I'm still facing an issue. When I run bin/janusgraph.sh start, it is able to connect to Cassandra and ES but times out on gremlin-server. I went through the logs, but there was no stack trace pointing to the exact error, just the timeout. I increased the wait time from the default 60 to 120, but the issue persists. Is that expected? Thanks
Following up on my comment about the connection timeout, I just made a discovery. If I add gremlin-jython: {}, gremlin-python: {} to scriptEngines in conf/gremlin-server/gremlin-server.yaml, I get the timeout error; without them I don't. But without them I'm still unable to fetch any results. Even g.V().count().next() throws an error: KeyError: None
I've edited my post above to add a couple more steps to install the gremlin-python plugin. If you are getting a timeout on the Gremlin Server, try killing its process, then start it again with bin/gremlin-server.sh and share the output in your original question.
I was unable to get it working with bin/janusgraph.sh start, but got it working with bin/gremlin-server.sh. Also, the problem of nothing being fetched is solved now that I'm using Gremlin-Python version 3.2.3. I was using 3.3.0 before, so it may have been a version mismatch. But now I have another question: how do you commit changes? I was able to add a vertex by doing g.addV('god').property('name', 'mars').property('age', 3500) and my result shows my vertex. But how do I commit? I tried g.addV(label, 'god', 'name', 'mars', 'age', 3000).tx().commit() and that failed. Do I need to create my own Traversal()?
And, adding to my previous point, how do we load GraphSON into gremlin-python? I went to tinkerpop.apache.org/docs/current/reference/#gremlin-python -> Custom Serialization but couldn't understand it. Sorry for bugging you so much, and I can add this to the existing question if required, but any help is greatly appreciated.

Your local "g" in the Gremlin Console is an embedded instance of a graph. It therefore "contains" something and is not empty. Your "g" in Python is "empty" in the sense that on its own it holds no vertices/edges; they are in the remote graph on Gremlin Server that it reflects. I assume that if you were to do a g.V().count() in Python you would get the same vertex count back as you would in Java. If not, then there is some other problem, but do not expect a "remote" graph instance to show vertices/edges of any sort (unless a day comes when gremlin-python is written as a Gremlin virtual machine that has its own Python-native graph database attached to it; in that case, "g" would be embedded, would own vertices/edges, and would likely no longer print as "empty").

9 Comments

So do you mean to say that Python's gremlin wrapper is unable to fetch the data/graph stored on the remote server? If that is the case, getting an empty graph is not really an issue. But then how do we fetch the graph stored in the DB, query it, and fetch results using Python?
No. It is perfectly capable of getting data from the remote graph. All I'm saying is that it says "empty" because the data is not local. It is analogous to EmptyGraph.instance() in Java. You only use it as a reference to a remote graph that actually holds the data. Basically, don't be confused by the label "empty": it bears no significance to the data that is actually available remotely.
Correct me if I'm wrong: you mean it shows empty because it doesn't actually store any data locally, but rather references my remote dataset? If that is the case then, as you suggested, g.V().count() should give me some results, the count of the remote objects, right? But even that comes back empty as [['V'], ['count']]
So, I did g.V().count().next(), and now it throws an exception: KeyError: None. A possible reason might be that my graph instance is actually empty. Any ideas regarding this?
So, I do g.V().count() from Gremlin, and that works like a charm. I also did addV() and then tried printing it back, though I didn't commit, and the result stayed the same!