Commit 52634bb

Merge pull request #1896 from marklogic/feature/mle-27018-change-vecscore-cosine-cosinedistance
MLE-27018 update cosine, cosineDistance and vectorScore to match codegen
2 parents 455048a + d7c6ad5 commit 52634bb

3 files changed

Lines changed: 156 additions & 50 deletions

marklogic-client-api/src/main/java/com/marklogic/client/expression/VecExpr.java

Lines changed: 55 additions & 40 deletions
@@ -57,30 +57,26 @@ public interface VecExpr {
 */
 public ServerExpression base64Encode(ServerExpression vector1);

-/**
-* Returns the cosine similarity between two vectors. The vectors must be of the same dimension.
-*
-* <a name="ml-server-type-cosine"></a>
-*
-* <p>
-* Provides a client interface to the <a href="http://docs.marklogic.com/vec:cosine" target="mlserverdoc">vec:cosine</a> server function.
-*
-* @param vector1 The vector from which to calculate the cosine similarity with vector2. (of <a href="{@docRoot}/doc-files/types/vec_vector.html">vec:vector</a>)
-* @param vector2 The vector from which to calculate the cosine similarity with vector1. (of <a href="{@docRoot}/doc-files/types/vec_vector.html">vec:vector</a>)
-* @return a server expression with the <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a> server data type
-* @since 7.2.0
-*/
-public ServerExpression cosine(ServerExpression vector1, ServerExpression vector2);
+/**
+* Returns the cosine of the angle between two vectors. The vectors must be of the same dimension.
+* <p>
+* Provides a client interface to the <a href="http://docs.marklogic.com/vec:cosine" target="mlserverdoc">vec:cosine</a> server function.
+* @param vector1 The vector from which to calculate the cosine with vector2. (of <a href="{@docRoot}/doc-files/types/vec_vector.html">vec:vector</a>)
+* @param vector2 The vector from which to calculate the cosine with vector1. (of <a href="{@docRoot}/doc-files/types/vec_vector.html">vec:vector</a>)
+* @return a server expression with the <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a> server data type
+* @since 7.2.0
+*/
+public ServerExpression cosine(ServerExpression vector1, ServerExpression vector2);

-/**
-* Return the distance between two vectors. The vectors must be of the same dimension.
-*
-* @param vector1 The vector from which to calculate the cosine distance with vector2. (of <a href="{@docRoot}/doc-files/types/vec_vector.html">vec:vector</a>)
-* @param vector2 The vector from which to calculate the cosine distance with vector1. (of <a href="{@docRoot}/doc-files/types/vec_vector.html">vec:vector</a>)
-* @return a server expression with the <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a> server data type
-* @since 7.2.0
-*/
-public ServerExpression cosineDistance(ServerExpression vector1, ServerExpression vector2);
+/**
+* Returns the cosine distance between two vectors. The vectors must be of the same dimension.
+*
+* @param vector1 The vector from which to calculate the cosine distance with vector2. (of <a href="{@docRoot}/doc-files/types/vec_vector.html">vec:vector</a>)
+* @param vector2 The vector from which to calculate the cosine distance with vector1. (of <a href="{@docRoot}/doc-files/types/vec_vector.html">vec:vector</a>)
+* @return a server expression with the <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a> server data type
+* @since 7.2.0
+*/
+public ServerExpression cosineDistance(ServerExpression vector1, ServerExpression vector2);

 /**
 * Returns the dimension of the vector passed in.
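
Reviewer note on the hunk above: the Javadoc now describes `cosine` as the cosine of the angle between two vectors and `cosineDistance` as the cosine distance. The plain-Java sketch below illustrates how the two quantities relate, assuming the conventional definition in which cosine distance is 1 minus the cosine. The class `CosineSketch` and its methods are illustrative names that do not appear in the commit, and the sketch does not call the MarkLogic client or server.

```java
// Illustrative only: local arithmetic on double[] arrays, not the VecExpr API.
final class CosineSketch {

    /** Cosine of the angle between a and b: dot(a, b) / (|a| * |b|). */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must be of the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("cosine is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Assumed conventional definition: cosine distance = 1 - cosine. */
    static double cosineDistance(double[] a, double[] b) {
        return 1.0 - cosine(a, b);
    }

    public static void main(String[] args) {
        double[] query = {1.0, 2.0, 3.0};
        double[] doc = {2.0, 4.0, 6.5};
        System.out.println("cosine         = " + cosine(query, doc));
        System.out.println("cosineDistance = " + cosineDistance(query, doc));
    }
}
```

Under this definition the cosine lies in [-1, 1] and grows with similarity, while the distance lies in [0, 2] and shrinks with similarity, which is why the two methods are not interchangeable as inputs to a distance-based score.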
@@ -246,44 +242,63 @@ public interface VecExpr {
 */
 public ServerExpression vector(ServerExpression values);
 /**
-* A helper function that returns a hybrid score using a cts score and a vector similarity calculation result. You can tune the effect of the vector similarity on the score using the similarityWeight option. The ideal value for similarityWeight depends on your application.
+* A helper function that returns a hybrid score using a cts score and a vector distance calculation result. You can tune the effect of the vector distance on the score using the distanceWeight option. The ideal value for distanceWeight depends on your application. The hybrid score is calculated using the formula: score = weight * annScore + (1 - weight) * ctsScore. - annScore is derived from the distance and distanceWeight, where a larger distanceWeight reduces the annScore for the same distance. - weight determines the contribution of the annScore and ctsScore to the final score. A weight of 0.5 balances both equally. This formula allows you to combine traditional cts scoring with vector-based distance scoring, providing a flexible way to rank results.
 * <p>
 * Provides a client interface to the <a href="http://docs.marklogic.com/vec:vector-score" target="mlserverdoc">vec:vector-score</a> server function.
 * @param score The cts:score of the matching document. (of <a href="{@docRoot}/doc-files/types/xs_unsignedInt.html">xs:unsignedInt</a>)
-* @param similarity The similarity between the vector in the matching document and the query vector. The result of a call to ovec:cosine(). In the case that the vectors are normalized, pass ovec:dot-product(). Note that vec:euclidean-distance() should not be used here. (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
+* @param distance The distance between the vector in the matching document and the query vector. Examples, the result of a call to ovec:cosine-distance() or ovec:euclidean-distance(). (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
 * @return a server expression with the <a href="{@docRoot}/doc-files/types/xs_unsignedLong.html">xs:unsignedLong</a> server data type
 */
-public ServerExpression vectorScore(ServerExpression score, double similarity);
+public ServerExpression vectorScore(ServerExpression score, double distance);
 /**
-* A helper function that returns a hybrid score using a cts score and a vector similarity calculation result. You can tune the effect of the vector similarity on the score using the similarityWeight option. The ideal value for similarityWeight depends on your application.
-*
-* <a name="ml-server-type-vector-score"></a>
-
+* A helper function that returns a hybrid score using a cts score and a vector distance calculation result. You can tune the effect of the vector distance on the score using the distanceWeight option. The ideal value for distanceWeight depends on your application. The hybrid score is calculated using the formula: score = weight * annScore + (1 - weight) * ctsScore. - annScore is derived from the distance and distanceWeight, where a larger distanceWeight reduces the annScore for the same distance. - weight determines the contribution of the annScore and ctsScore to the final score. A weight of 0.5 balances both equally. This formula allows you to combine traditional cts scoring with vector-based distance scoring, providing a flexible way to rank results.
+* <p>
+* Provides a client interface to the <a href="http://docs.marklogic.com/vec:vector-score" target="mlserverdoc">vec:vector-score</a> server function.
+* @param score The cts:score of the matching document. (of <a href="{@docRoot}/doc-files/types/xs_unsignedInt.html">xs:unsignedInt</a>)
+* @param distance The distance between the vector in the matching document and the query vector. Examples, the result of a call to ovec:cosine-distance() or ovec:euclidean-distance(). (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
+* @return a server expression with the <a href="{@docRoot}/doc-files/types/xs_unsignedLong.html">xs:unsignedLong</a> server data type
+*/
+public ServerExpression vectorScore(ServerExpression score, ServerExpression distance);
+/**
+* A helper function that returns a hybrid score using a cts score and a vector distance calculation result. You can tune the effect of the vector distance on the score using the distanceWeight option. The ideal value for distanceWeight depends on your application. The hybrid score is calculated using the formula: score = weight * annScore + (1 - weight) * ctsScore. - annScore is derived from the distance and distanceWeight, where a larger distanceWeight reduces the annScore for the same distance. - weight determines the contribution of the annScore and ctsScore to the final score. A weight of 0.5 balances both equally. This formula allows you to combine traditional cts scoring with vector-based distance scoring, providing a flexible way to rank results.
+* <p>
+* Provides a client interface to the <a href="http://docs.marklogic.com/vec:vector-score" target="mlserverdoc">vec:vector-score</a> server function.
+* @param score The cts:score of the matching document. (of <a href="{@docRoot}/doc-files/types/xs_unsignedInt.html">xs:unsignedInt</a>)
+* @param distance The distance between the vector in the matching document and the query vector. Examples, the result of a call to ovec:cosine-distance() or ovec:euclidean-distance(). (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
+* @param distanceWeight The weight of the vector distance on the annScore. This value is a positive coefficient that scales the distance. A larger distanceWeight produces a lower annScore for the same distance. The default value is 1. (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
+* @return a server expression with the <a href="{@docRoot}/doc-files/types/xs_unsignedLong.html">xs:unsignedLong</a> server data type
+*/
+public ServerExpression vectorScore(ServerExpression score, double distance, double distanceWeight);
+/**
+* A helper function that returns a hybrid score using a cts score and a vector distance calculation result. You can tune the effect of the vector distance on the score using the distanceWeight option. The ideal value for distanceWeight depends on your application. The hybrid score is calculated using the formula: score = weight * annScore + (1 - weight) * ctsScore. - annScore is derived from the distance and distanceWeight, where a larger distanceWeight reduces the annScore for the same distance. - weight determines the contribution of the annScore and ctsScore to the final score. A weight of 0.5 balances both equally. This formula allows you to combine traditional cts scoring with vector-based distance scoring, providing a flexible way to rank results.
 * <p>
 * Provides a client interface to the <a href="http://docs.marklogic.com/vec:vector-score" target="mlserverdoc">vec:vector-score</a> server function.
 * @param score The cts:score of the matching document. (of <a href="{@docRoot}/doc-files/types/xs_unsignedInt.html">xs:unsignedInt</a>)
-* @param similarity The similarity between the vector in the matching document and the query vector. The result of a call to ovec:cosine(). In the case that the vectors are normalized, pass ovec:dot-product(). Note that vec:euclidean-distance() should not be used here. (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
+* @param distance The distance between the vector in the matching document and the query vector. Examples, the result of a call to ovec:cosine-distance() or ovec:euclidean-distance(). (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
+* @param distanceWeight The weight of the vector distance on the annScore. This value is a positive coefficient that scales the distance. A larger distanceWeight produces a lower annScore for the same distance. The default value is 1. (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
 * @return a server expression with the <a href="{@docRoot}/doc-files/types/xs_unsignedLong.html">xs:unsignedLong</a> server data type
 */
-public ServerExpression vectorScore(ServerExpression score, ServerExpression similarity);
+public ServerExpression vectorScore(ServerExpression score, ServerExpression distance, ServerExpression distanceWeight);
 /**
-* A helper function that returns a hybrid score using a cts score and a vector similarity calculation result. You can tune the effect of the vector similarity on the score using the similarityWeight option. The ideal value for similarityWeight depends on your application.
+* A helper function that returns a hybrid score using a cts score and a vector distance calculation result. You can tune the effect of the vector distance on the score using the distanceWeight option. The ideal value for distanceWeight depends on your application. The hybrid score is calculated using the formula: score = weight * annScore + (1 - weight) * ctsScore. - annScore is derived from the distance and distanceWeight, where a larger distanceWeight reduces the annScore for the same distance. - weight determines the contribution of the annScore and ctsScore to the final score. A weight of 0.5 balances both equally. This formula allows you to combine traditional cts scoring with vector-based distance scoring, providing a flexible way to rank results.
 * <p>
 * Provides a client interface to the <a href="http://docs.marklogic.com/vec:vector-score" target="mlserverdoc">vec:vector-score</a> server function.
 * @param score The cts:score of the matching document. (of <a href="{@docRoot}/doc-files/types/xs_unsignedInt.html">xs:unsignedInt</a>)
-* @param similarity The similarity between the vector in the matching document and the query vector. The result of a call to ovec:cosine(). In the case that the vectors are normalized, pass ovec:dot-product(). Note that vec:euclidean-distance() should not be used here. (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
-* @param similarityWeight The weight of the vector similarity on the score. The default value is 0.1. If 0.0 is passed in, vector similarity has no effect. If passed a value less than 0.0 or greater than 1.0, throw VEC-VECTORSCORE. (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
+* @param distance The distance between the vector in the matching document and the query vector. Examples, the result of a call to ovec:cosine-distance() or ovec:euclidean-distance(). (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
+* @param distanceWeight The weight of the vector distance on the annScore. This value is a positive coefficient that scales the distance. A larger distanceWeight produces a lower annScore for the same distance. The default value is 1. (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
+* @param weight The weight of the annScore in the final hybrid score. This value is a coefficient between 0 and 1, where 0 gives full weight to the cts score and 1 gives full weight to the annScore. The default value is 0.5. (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
 * @return a server expression with the <a href="{@docRoot}/doc-files/types/xs_unsignedLong.html">xs:unsignedLong</a> server data type
 */
-public ServerExpression vectorScore(ServerExpression score, double similarity, double similarityWeight);
+public ServerExpression vectorScore(ServerExpression score, double distance, double distanceWeight, double weight);
 /**
-* A helper function that returns a hybrid score using a cts score and a vector similarity calculation result. You can tune the effect of the vector similarity on the score using the similarityWeight option. The ideal value for similarityWeight depends on your application.
+* A helper function that returns a hybrid score using a cts score and a vector distance calculation result. You can tune the effect of the vector distance on the score using the distanceWeight option. The ideal value for distanceWeight depends on your application. The hybrid score is calculated using the formula: score = weight * annScore + (1 - weight) * ctsScore. - annScore is derived from the distance and distanceWeight, where a larger distanceWeight reduces the annScore for the same distance. - weight determines the contribution of the annScore and ctsScore to the final score. A weight of 0.5 balances both equally. This formula allows you to combine traditional cts scoring with vector-based distance scoring, providing a flexible way to rank results.
 * <p>
 * Provides a client interface to the <a href="http://docs.marklogic.com/vec:vector-score" target="mlserverdoc">vec:vector-score</a> server function.
 * @param score The cts:score of the matching document. (of <a href="{@docRoot}/doc-files/types/xs_unsignedInt.html">xs:unsignedInt</a>)
-* @param similarity The similarity between the vector in the matching document and the query vector. The result of a call to ovec:cosine(). In the case that the vectors are normalized, pass ovec:dot-product(). Note that vec:euclidean-distance() should not be used here. (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
-* @param similarityWeight The weight of the vector similarity on the score. The default value is 0.1. If 0.0 is passed in, vector similarity has no effect. If passed a value less than 0.0 or greater than 1.0, throw VEC-VECTORSCORE. (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
+* @param distance The distance between the vector in the matching document and the query vector. Examples, the result of a call to ovec:cosine-distance() or ovec:euclidean-distance(). (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
+* @param distanceWeight The weight of the vector distance on the annScore. This value is a positive coefficient that scales the distance. A larger distanceWeight produces a lower annScore for the same distance. The default value is 1. (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
+* @param weight The weight of the annScore in the final hybrid score. This value is a coefficient between 0 and 1, where 0 gives full weight to the cts score and 1 gives full weight to the annScore. The default value is 0.5. (of <a href="{@docRoot}/doc-files/types/xs_double.html">xs:double</a>)
 * @return a server expression with the <a href="{@docRoot}/doc-files/types/xs_unsignedLong.html">xs:unsignedLong</a> server data type
 */
-public ServerExpression vectorScore(ServerExpression score, ServerExpression similarity, ServerExpression similarityWeight);
+public ServerExpression vectorScore(ServerExpression score, ServerExpression distance, ServerExpression distanceWeight, ServerExpression weight);
 }
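
Reviewer note on the new `vectorScore` Javadoc: it gives the blend `score = weight * annScore + (1 - weight) * ctsScore` but does not say how `annScore` is computed from `distance` and `distanceWeight`, only that a larger `distanceWeight` lowers it. The plain-Java sketch below shows the blend with the documented defaults (`distanceWeight` = 1, `weight` = 0.5). The mapping `1 / (1 + distanceWeight * distance)` is a hypothetical stand-in chosen only because it has that monotonic property; it is not taken from the commit or the server. The sketch also treats `ctsScore` as a double on a 0 to 1 scale, whereas the real function takes an `xs:unsignedInt` and returns an `xs:unsignedLong`. `HybridScoreSketch` and its methods are illustrative names.

```java
// Illustrative only: local arithmetic, not the VecExpr API or the server's scoring.
final class HybridScoreSketch {

    /**
     * HYPOTHETICAL annScore mapping (an assumption, not MarkLogic's formula).
     * It decreases as distance or distanceWeight grows, matching the Javadoc's
     * qualitative description.
     */
    static double annScore(double distance, double distanceWeight) {
        if (distance < 0.0) {
            throw new IllegalArgumentException("distance must be non-negative");
        }
        if (distanceWeight <= 0.0) {
            throw new IllegalArgumentException("distanceWeight must be positive");
        }
        return 1.0 / (1.0 + distanceWeight * distance);
    }

    /** The documented blend: weight * annScore + (1 - weight) * ctsScore. */
    static double hybridScore(double ctsScore, double distance,
                              double distanceWeight, double weight) {
        if (weight < 0.0 || weight > 1.0) {
            throw new IllegalArgumentException("weight must be between 0 and 1");
        }
        return weight * annScore(distance, distanceWeight) + (1.0 - weight) * ctsScore;
    }

    /** Two-argument form using the documented defaults: distanceWeight = 1, weight = 0.5. */
    static double hybridScore(double ctsScore, double distance) {
        return hybridScore(ctsScore, distance, 1.0, 0.5);
    }

    public static void main(String[] args) {
        double ctsScore = 0.8;   // assumed to be normalized to [0, 1] for this sketch
        double distance = 1.0;   // e.g. a cosine distance
        System.out.println("defaults:        " + hybridScore(ctsScore, distance));
        System.out.println("cts only (w=0):  " + hybridScore(ctsScore, distance, 1.0, 0.0));
        System.out.println("ann only (w=1):  " + hybridScore(ctsScore, distance, 1.0, 1.0));
        System.out.println("distanceWeight=4: " + hybridScore(ctsScore, distance, 4.0, 0.5));
    }
}
```

The overload chain mirrors the shape of the new signatures: each added parameter (`distanceWeight`, then `weight`) replaces a documented default rather than changing the formula.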
