Intuitively, \mmt is a declarative language for theories and views over an arbitrary object language.

Its treatment of object languages is abstract enough to subsume most practically relevant logics and type theories.

...

...

@@ -140,59 +138,63 @@ Defined constants can be eliminated by definition expansion.

%

%\end{oldpart}

\subsection{Proof Assistant Libraries in MMT}\label{sec:oaf}

As part of the OAF project~\cite{OAFproject:on}, we have imported several proof assistant libraries into the MMT system. To motivate some of the design choices made in this paper, we will outline the general procedure behind these imports.

\paragraph{} First, we formalize the core logical foundation of the system. We do so by using the logical framework LF~\cite{lf} (at its core a dependently-typed lambda calculus) and various extensions thereof, which are implemented in and supported by the MMT system. In LF, we can formalize the foundational primitives using the usual judgments-as-types and higher-order abstract syntax encodings -- hence theorems and axioms are represented as constants with a type $\vdash P$ for some proposition $P$, and primitive constructs like lambdas are formalized as LF-functions taking LF-lambda-expressions -- which serve as a general encoding of any variable binders -- as arguments.

The resulting formalizations are then used as meta-theory for imports of the libraries of the system under consideration. This results in a theory graph as in Figure \ref{fig:oaf}.

\begin{figure}[ht]\centering

\begin{tikzpicture}

\node (MMT) at (2,2.5) {MMT};

\draw[fill=orange!40] (2,1) ellipse (1.5cm and .6cm);

\node[color=orange] at (-3.3,1) {Logical Frameworks};

\node (L) at (1,1) {LF};

\node (Lx) at (3,1) {LF+X};

\draw[arrow](MMT) -- (L);

\draw[arrow](MMT) -- (Lx);

\draw[arrow](L) -- (Lx);

\draw[fill=red!40] (2,-.5) ellipse (3.2cm and .6cm);

\node[color=red] at (-3.3,-.5) {Foundations};

\node at (2,-.7) {\ldots};

\draw[fill=blue!40] (0,-2.25) ellipse (1.9cm and .8cm);

\node (H) at (0,-.5) {HOL Light};

\node[color=blue!80] at (-3.5,-2) {HOL Light library};

\node (B) at (-1,-2) {Bool};

\node (A) at (1,-2) {Arith};

\node (E) at (0,-2.5) {\ldots};

\draw[arrow](L) -- (H);

\draw[arrow](H) -- (B);

\draw[arrow](H) -- (A);

\draw[arrow](B) -- (A);

\draw[fill=olive!40] (4,-2.25) ellipse (1.9cm and .8cm);

\node (M) at (4,-.5) {PVS};

\node[color=olive] at (-3.3,-2.5) {PVS library};

\node (B') at (3,-2) {Booleans};

\node (A') at (5,-2) {Reals};

\node (E') at (4,-2.5) {\ldots};

\node (A) at (1,-2) {Arith};

\node (E) at (0,-2.5) {\ldots};

\draw[arrow](Lx) -- (M);

\draw[arrow](M) -- (B');

\draw[arrow](M) -- (A');

\draw[arrow](B') -- (A');

\end{tikzpicture}

\caption{A (Simplified) Theory Graph for the OAF Project}\label{fig:oaf}

\end{figure}

%\subsection{Proof Assistant Libraries in MMT}\label{sec:oaf}

%

%As part of the OAF project~\cite{OAFproject:on}, we have imported several proof assistant libraries into the MMT system. To motivate some of the design choices made in this paper, we will outline the general procedure behind these imports.

%

%\paragraph{} First, we formalize the core logical foundation of the system. We do so by using the logical framework LF~\cite{lf} (at its core a dependently-typed lambda calculus) and various extensions thereof, which are implemented in and supported by the MMT system. In LF, we can formalize the foundational primitives using the usual judgments-as-types and higher-order abstract syntax encodings -- hence theorems and axioms are represented as constants with a type $\vdash P$ for some proposition $P$, and primitive constructs like lambdas are formalized as LF-functions taking LF-lambda-expressions -- which serve as a general encoding of any variable binders -- as arguments.

%

%The resulting formalizations are then used as meta-theory for imports of the libraries of the system under consideration. This results in a theory graph as in Figure \ref{fig:oaf}.

%

%\begin{figure}[ht]\centering

% \begin{tikzpicture}

% \node (MMT) at (2,2.5) {MMT};

%

% \draw[fill=orange!40] (2,1) ellipse (1.5cm and .6cm);

% \node[color=orange] at (-3.3,1) {Logical Frameworks};

% \node (L) at (1,1) {LF};

% \node (Lx) at (3,1) {LF+X};

% \draw[arrow](MMT) -- (L);

% \draw[arrow](MMT) -- (Lx);

% \draw[arrow](L) -- (Lx);

%

% \draw[fill=red!40] (2,-.5) ellipse (3.2cm and .6cm);

% \node[color=red] at (-3.3,-.5) {Foundations};

% \node at (2,-.7) {\ldots};

%

% \draw[fill=blue!40] (0,-2.25) ellipse (1.9cm and .8cm);

%

% \node (H) at (0,-.5) {HOL Light};

% \node[color=blue!80] at (-3.5,-2) {HOL Light library};

% \node (B) at (-1,-2) {Bool};

% \node (A) at (1,-2) {Arith};

% \node (E) at (0,-2.5) {\ldots};

% \draw[arrow](L) -- (H);

% \draw[arrow](H) -- (B);

% \draw[arrow](H) -- (A);

% \draw[arrow](B) -- (A);

%

% \draw[fill=olive!40] (4,-2.25) ellipse (1.9cm and .8cm);

%

% \node (M) at (4,-.5) {PVS};

% \node[color=olive] at (-3.3,-2.5) {PVS library};

% \node (B') at (3,-2) {Booleans};

% \node (A') at (5,-2) {Reals};

% \node (E') at (4,-2.5) {\ldots};

%

% \node (A) at (1,-2) {Arith};

% \node (E) at (0,-2.5) {\ldots};

% \draw[arrow](Lx) -- (M);

% \draw[arrow](M) -- (B');

% \draw[arrow](M) -- (A');

% \draw[arrow](B') -- (A');

% \end{tikzpicture}

% \caption{A (Simplified) Theory Graph for the OAF Project}\label{fig:oaf}

We now generalize to view-finding between theories in different libraries (and built on different foundations and meta-theories). Obviously, various differences in available foundational primitives and library-specific best practices and idiosyncrasies can prevent the algorithm from finding desired matches. There are two approaches to increasing the number of results in these cases:

\begin{itemize}

\item In many instances, the translations between two foundations is too complex to be discovered purely syntactically. In these cases we can provide arbitrary meta-morphisms between theories\ednote{MK: starting alignments?} -- especially (but not exclusively) on the meta-theories -- which are applied before computing the abstract syntax trees (e.g. alignment-based translations as in \cite{MRLR:alignments:17})

\item We can apply additional foundation-specific normalizations before preprocessing theories, such as eliminating higher-order abstract syntax encodings or implicit arguments, or elaborating specific abbreviations/definitions.

\end{itemize}

The normalizations mentioned in Section \ref{sec:normalizeintra} already suggest equating the involved logical primitives (such as logical connectives) via a meta-morphism.

We now generalize to view finding to different libraries written in different logics.

Intuitively, the key idea is that we now have two fixed meta-theories $M$ and $M'$ and a fixed meta-view $m:M\to M'$.

However, due to the various idiosyncrasies of logics, tools' library structuring features, individual library conventions, this problem is significantly more difficult than \emph{intra}-library view finding.

For example, unless the logics are closely related, meta-views usually do not even exist and must be approximated.

Therefore, a lot of tweaking is typically necessary, and it is possible that multiple runs with different trade-offs give different interesting results.

Foundation-specific normalizations specifically for finding views \emph{across} libraries is to our knowledge an as-of-yet unexplored field of investigation. Every formal system has certain unique idiosyncrasies, best practices or widely used features; finding an ideal normalization method is a correspondingly difficult domain-specific problem.

As an example, we present a large case study where we find view from the MitM library used in the running example so far into the PVS/NASA library.

We will discuss some of our findings specifically regarding the PVS library as a case study.

\subsection{Normalization in PVS}\label{sec:normalizeinter}

PVS~\cite{pvs} is a proof assistant under active development, based on a higher-order logic with predicate subtyping and various convenience features such as record types, update expressions and inductive datatypes. In addition to the \emph{Prelude} library, which contains the most common domains of mathematical discourse and is shipped with PVS itself, there is a large library of formal mathematics developed and maintained by NASA~\cite{PVSlibraries:on}.

PVS~\cite{pvs} is a proof assistant under active development based on a higher-order logic with a number of advanced features.

In addition to the \emph{Prelude} library, which contains the most common domains of mathematical discourse and is shipped with PVS itself, there is a large library of formal mathematics developed and maintained by NASA~\cite{PVSlibraries:on}.

In \cite{KMOR:pvs:17}, we represent PVS as a meta-theory in MMT and implemented a translator that transforms both libraries into MMT format.

\paragraph{} While features like subtyping and records are interesting challenges, we will concentrate on one specific idiosyncrasy in PVS -- its prevalent use of \emph{theory parameters}.

\paragraph{Formula Normalization}

Naturally advanced features of the PVS type system such as predicate subtyping, record types, inductive types will not be part of views from MitM, which does not have corresponding features.

Therefore, our view finder can mostly ignore them.

In practice, theory parameters are often used in PVS for the signature of an abstract theory. For example, the theory of groups \cn{group\_def} in the NASA library has three theory parameters $(\cn T,\ast,\cn{one})$ for the signature, and includes the theory \cn{monoid\_def} with the same parameters; the axioms for a group are then formalized as a predicate on the theory parameters.

Instead, we only have to normalize PVS formulas to the extent that they use logical features that correspond to those of MitM.

This is particular the basic higher-order logic.

Thus, we use a meta-view that embeds MitM's higher-order logic into PVS's higher-order logic and make sure that we normalize PVS-formulas in the same way as MitM-formulas.

Given that the same practice is used in few other systems (if any), searching for views without treating theory parameters in some way will not give us any useful results on these theories. We offer three approaches to handling these situations:

\begin{enumerate}

\item\emph{Simple treatment:} We can interpret references to theory parameters as free variables and turn them into holes. Includes of parametric theories with arguments are turned into simple includes.

\item\emph{Covariant treatment:} We introduce new constants for the theory parameters and replace occurrences of the parameters by constant references. Includes with parameters are again replaced by normal includes.

\paragraph{Theory Structure Normalization}

However, PVS's complex and prevalently used parametric theories critically affect view finding because it affects the structure of theories.

For example, the theory of groups \cn{group\_def} in the NASA library has three theory parameters $(\cn T,\ast,\cn{one})$ for the signature of groups, and includes the theory \cn{monoid\_def} with the same parameters, and then declares the axioms for a group in terms of these parameters.

Without special treatment, we could only find views from/into libraries that use the same theory structure.

We have investigated three approaches of handling parametric theories:

\begin{compactenum}

\item\emph{Simple treatment:} We drop theory parameters and interpret references to them as free variables that match anything.

This is of course not sound so that all found views must be double-checked.

However, because practical search problems often do not require exact results, even returning all potential views can be useful.

\item\emph{Covariant elimination:} We treat theory parameters as if they were constants declared in the body.

In the above mentioned theory \cn{group\_def}, we would hence add three new constants \cn T, $\ast$ and \cn{one} with their corresponding types.

\item\emph{Contravariant treatment:} Theory parameters are eliminated by binding them as arguments to \emph{each constant in the theory}. References to the treated constants are replaced by applications of the symbols to the parameters of the original include.

This works well in the common case a parametric theory is not used with two different instantiations in the same context.

\item\emph{Contravariant elimination:}

The theory parameters are treated as if they were bound separately for every constant in the body of the theory.

In the above mentioned theory \cn{group\_def}, we would change e.g. the unary predicate \cn{inverse\_exists?} with type $T\to\cn{bool}$ to a function with type $(T : \cn{pvstype})\to(\ast : T \to T \to T)\to(\cn{one}:T)\to(T\to\cn{bool})$.

An include of $\cn{group\_def}(S,\circ,e)$ in some other theory, e.g. \cn{group}, would be replaced by a simple include, but occurrences of \cn{inverse\_exists?} in \cn{group} would be replaced by $\oma{\cn{inverse\_exists?}}{S,\circ,e}$.

\end{enumerate}

We have implemented the first two approaches. The first it is the most straight-forward but it lead to many false positives and negatives. The second approach is most useful for inter-library search, since it most closely corresponds to formalizations of abstract theories in other systems. A problem here is that the newly introduced constants are not passed on to includes with the same arguments without additional tranformations.

The third approach would turn every occurence of e.g. group-related symbols into function applications, which is a rather rare practice in most other systems. However, since this treatment of theory parameters comes closest to the semantics of the parameters, we conjecture that it is the most useful approach for intra-library view finding between PVS theories.

Additionally to the theory parameter related normalization, we use the following techniques:

\begin{itemize}

\item We curry function types $(A_1\times\ldots A_n)\to B$ to $A_1\to\ldots\to A_n\to B$. We treat lambda-expressions and applications accordingly.

For example, $f(\langle a, b\rangle)$ becomes $f(a,b)$ and $4=\lambda(n,m) : \mathbb N \times\mathbb N .\; n + m$ becomes $f =\lambda n : \mathbb N.\;\lambda m : \mathbb N .\; n + m$.

\item Higher-order abstract syntax encodings are eliminated by raising atomic types, function types, applications and lambdas to the level of the logical framework. This eliminates (redundant) implicit arguments that only occur due to their formalization in the logical framework.

For example, the apply-operator in the PVS meta-theory has the LF-type

Hence every function application in PVS -- as \mmt-term: $\oma{\cn{pvsapply}}{A,B,f,a}$ -- has two additional implicit arguments, which we eliminate by replacing this expression by $\oma{f}{a}$.

This prevents false negatives due to mismatching type arguments in the presence of subtyping.

\end{itemize}

% An include of $\cn{group\_def}(S,\circ,e)$ in some other theory, e.g. \cn{group}, would be replaced by a simple include, but occurrences of \cn{inverse\_exists?} in \cn{group} would be replaced by $\oma{\cn{inverse\_exists?}}{S,\circ,e}$.

This is closest to the actual semantics of the PVS module system.

But it makes finding interesting views the least likely because it is the most sensitive to the modular structure of individual theories.

\end{compactenum}

We have implemented the first two approaches.

The first it is the most straightforward but it leads to many false positives and false negatives.

We have found the second approach to be the most useful for inter-library search since it most closely corresponds to simple formalizations of abstract theories in other libraries.

% A problem here is that the newly introduced constants are not passed on to includes with the same arguments without additional tranformations.

The third approach will be our method of choice when investigating \emph{intra}-library views of PVS/NASA in future work.

% would turn every occurrence of e.g. group-related symbols into function applications, which is a rather rare practice in most other systems. However, since this treatment of theory parameters comes closest to the semantics of the parameters, we conjecture that it is the most useful approach for intra-library view finding between PVS theories.

\subsection{Implementation}\label{sec:pvs}

We have tried the first two approaches regarding theory parameters -- i.e. the simple and covariant treatments -- to both inter- and intra-library view finding problems. Concretely, we have used a simple theory of Monoids in the Math-in-the-Middle foundation and the theory of Monoids in the NASA library as domains with the whole NASA library as target. The results are summarized in Figure \ref{fig:pvsresults}.

We have applied the first two simple and the covariant approach described above.

We wrote a simple theory of monoids in the Math-in-the-Middle foundation and the theory of Monoids in the NASA library as domains with the whole NASA library as target.

The results are summarized in Figure \ref{fig:pvsresults}.